Comparing 4cd5e5b48e...13953f012d - mesa

fran/mesa

Author	SHA1	Message	Date
Emil Velikov	13953f012d	docs: use correct year for the 12.0.6 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 02:05:20 +00:00
Emil Velikov	36e3f2542d	docs: add sha256 checksums for 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 02:02:48 +00:00
Emil Velikov	555885a0bf	docs: add release notes for 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 01:32:02 +00:00
Emil Velikov	ab62405953	Update version to 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 01:28:48 +00:00
Kenneth Graunke	806de4a224	i965: Properly flush in hsw_pause_transform_feedback(). Fixes a number of transform feedback tests when run with Linux 4.8, which allows us to use the MI_LOAD_REGISTER_REG command, at which point we started using this new broken path. ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.* and Piglit's arb_transform_feedback2/draw-auto are both fixed by this patch, for example. Thanks to Chris Wilson for catching this mistake! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `2138347a45`)	2017-01-20 22:47:28 +00:00
Chad Versace	2b87bb9b90	anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0 The spec implicitly allows the incoming count to be 0. From the Vulkan 1.0.38 spec, Section 4.1 Physical Devices: If the value referenced by pQueueFamilyPropertyCount is not 0 [then do stuff]. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `d6545f2345`)	2017-01-20 22:40:51 +00:00
Emil Velikov	689ca381b5	egl/wayland: use the destroy_window_callback for swrast As described in commit `690ead4a13` ("egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.") if we attempt to destroy an EGL surface attached to an already destroyed Wayland window we'll get a segfault. v2: set the correct callback alongside the window->private. (Dan) Cc: Daniel Stone <daniels@collabora.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (cherry picked from commit `bfd6314350`)	2017-01-20 22:21:58 +00:00
Emil Velikov	cc2894d376	automake: use shared llvm libs for make distcheck Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `23dcce0c03`)	2017-01-13 05:27:03 +00:00
Chad Versace	febf22ff55	i965/mt: Disable HiZ when sharing depth buffer externally (v2) intel_miptree_make_shareable() discarded and disabled CCS. Fix it so that it discards and disables HiZ too. Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer on Skylake. v2: Actually do what the commit message says. Discard the HiZ buffer. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: Haixia Shi <hshi@chromium.org> (cherry picked from commit `42011be1e2`) [Emil Velikov: patch is a backport by Chad of above commit]	2017-01-13 05:27:02 +00:00
Chad Versace	3c7b53bba3	i965/mt: Disable aux surfaces after making miptree shareable The entire goal of intel_miptree_make_shareable() is to permanently disable the miptree's aux surfaces. So set intel_mipmap_tree:disable_aux_buffers after the function's done with discarding the aux surfaces. References: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `1c8be049be`)	2017-01-13 05:27:02 +00:00
Emil Velikov	c880deef41	get-typod-pick-list.sh: add new script Typos do happen as people nominate patches for stable. This script aims to catch most of those. Due to the subtle nature of things, one has to pay special attention to the output, similar to get-extra-pick-list.sh. At the moment only the following is handled: grep -i "CC:.*mesa-dev" Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f0bdd13fdb`)	2017-01-13 05:27:02 +00:00
Ilia Mirkin	09973d9a99	nouveau: take extra push space into account for pushbuf_space calls Ever since a long time ago when I messed around with fences, I ensure that after a PUSH_SPACE call there is enough space to write a fence out into the pushbuf. However the PUSH_SPACE macro is not all-knowing, and so sometimes we have to invoke nouveau_pushbuf_space manually with the relocs/pushes args set. If we don't take the extra allocation from PUSH_SPACE into account, then we will end up accidentally flushing when the code was not expecting a flush. This can lead to various runtime and rendering failures. The amount of extra allocation isn't that important - it has to be at least 8 based on the current nouveau_winsys.h setting, but even more won't hurt. I just rounded up to powers of 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Ben Skeggs <bskeggs@redhat.com> (cherry picked from commit `eb60a89bc3`)	2017-01-13 05:27:02 +00:00
Kenneth Graunke	36a54c27fd	spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass. vtn_ssa_value() can produce variable loads, and the cursor might be after a return statement, causing nir_builder assert failures about not inserting instructions after a jump. This fixes: dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_if dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_switch Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `203c128781`)	2017-01-13 05:27:02 +00:00
Fredrik Höglund	57708155d2	dri3: Fix MakeCurrent without a default framebuffer In OpenGL 3.0 and later it is legal to make a context current without a default framebuffer. This has been broken since DRI3 support was introduced. Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `b6670157d7`)	2017-01-13 05:27:02 +00:00
Michel Dänzer	cf18ee4fcc	cso: Don't restore nr_samplers in cso_restore_fragment_samplers If info->nr_samplers > ctx->nr_fragment_samplers_saved, the assignment would prevent cso_single_sampler_done from unbinding the no longer used samplers from the driver, which could result in use-after-free. This is probably unlikely to happen in practice though. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `3d661a12be`)	2017-01-13 05:27:02 +00:00
Jason Ekstrand	76816e70a9	anv/descriptor_set: Write the state offset in the surface state free list. When Kristian reworked descriptor set allocation, somehow he forgot to actually store the offset in the free list. Somehow, this completely missed CTS testing until now... This fixes all 2744 of the new 'dEQP-VK.texture.filtering.*' tests in the latest CTS. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (cherry picked from commit `37537b7d86`)	2017-01-13 05:27:02 +00:00
Jason Ekstrand	c0934035a5	anv/device: Implicitly unmap memory objects in FreeMemory From the Vulkan spec version 1.0.32 docs for vkFreeMemory: "If a memory object is mapped at the time it is freed, it is implicitly unmapped." Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `b1217eada9`)	2017-01-13 05:27:02 +00:00
Jason Ekstrand	08a9f69a8b	anv/device: Return the right error for failed maps Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `920f34a2d9`)	2017-01-13 05:27:02 +00:00
Jason Ekstrand	d780f89966	spirv/nir: Add support for ImageQuerySamples Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `9e05e51cff`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	a02edabb67	spirv/nir: Handle texture projectors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `71202352c8`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	70bb67febc	nir/spirv: Refactor coordinate handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `36c31b8fa2`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	9126479017	spirv/nir: Refactor type handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `b820c8b78c`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	f76da483a2	spirv/nir: Move opcode selection higher up in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `561be50a1a`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	4ac5633618	anv/image: Assert that the image format is actually supported Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `c8da91aa24`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	32d7a060fa	spirv/nir: Don't increment coord_components for array lod queries For lod query instructions, we really don't care whether or not the sampler is an array type because that doesn't factor into the LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `34a39e91ba`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	eb96145c74	i965: Get rid of the do_lower_unnormalized_offsets pass We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `67b7d876e4`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	ddd048bbf5	i965/nir: Enable NIR lowering of txf and rect offsets This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `9f32721f86`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	236ecd3c4e	nir/lower_tex: Add support for lowering coordinate offsets On i965, we can't support coordinate offsets for texelFetch or rectangle textures. Previously, we were doing this with a GLSL pass but we need to do it in NIR if we want those workarounds for SPIR-V. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `d9156efc52`)	2017-01-13 05:27:01 +00:00
Jason Ekstrand	81e78ee65c	nir/lower_tex: Add some helpers for working with tex sources Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `843fc8f3e7`)	2017-01-13 05:27:00 +00:00
Jason Ekstrand	6ebb536800	nir: Add a helper for determining the type of a texture source Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `09135cd55a`)	2017-01-13 05:27:00 +00:00
Jason Ekstrand	89a8fd71af	anv/pipeline: Set binding_table.gather_texture_start This should get texture gather working on gen8+ and mostly working on gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `3c0077a6ec`)	2017-01-13 05:27:00 +00:00
Jason Ekstrand	8a293e6a0c	spirv/nir: Properly handle gather components Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `95e9d58bdb`)	2017-01-13 05:27:00 +00:00
Jason Ekstrand	231ace7eec	spirv/nir: Add support for shadow samplers that return vec4 While SPIR-V technically doesn't support "old style" shadow, the shadow-compare gather instruction does return a vec4 so we need to be able to set the old_style_shadow bit in NIR. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `7c7acf53b2`)	2017-01-13 05:27:00 +00:00
Jason Ekstrand	c07386e2c8	spirv/nir: Fix some texture opcode asserts We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an explicit-LOD texturing instruction. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org> (cherry picked from commit `2ddefd03b7`)	2017-01-13 05:27:00 +00:00
Nicolai Hähnle	bb4195ca26	radeonsi: enable WQM in PS prolog when needed WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives. Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Cc: 12.0 <mesa-dev@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `b42bc90b6a`)	2017-01-13 05:27:00 +00:00
Matt Turner	0386f956b3	i965/fs: Reject copy propagation into SEL if not min/max. We shouldn't ever see a SEL with conditional mod other than GE (for max) or L (for min), but we might see one with predication and no conditional mod. total instructions in shared programs: 8241806 -> 8241902 (0.00%) instructions in affected programs: 13284 -> 13380 (0.72%) HURT: 62 total cycles in shared programs: 84165104 -> 84166244 (0.00%) cycles in affected programs: 75364 -> 76504 (1.51%) helped: 10 HURT: 34 Fixes generated code in at least Sanctum 2, Borderlands 2, Goat Simulator, XCOM: Enemy Unknown, and Shogun 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92234 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `7bed52bb5f`)	2017-01-13 05:27:00 +00:00
Matt Turner	eb9127d224	i965/fs: Add unit tests for copy propagation pass. Pretty basic, but it's a start. Acked-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `091a8a04ad`) [Emil Velikov: s/gen_device_info/brw_device_info/, nir_shader_create() has only three arguments] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 05:25:00 +00:00
Matt Turner	d37d8d81d5	i965/fs: Rename opt_copy_propagate -> opt_copy_propagation. Matches the vec4 backend, cmod propagation, and saturate propagation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `6014da50ec`) [Emil Velikov: resolve trivial conflicts - don't rename instances which do not exist] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/drivers/dri/i965/brw_fs.cpp	2016-12-16 22:35:24 +00:00
Marek Olšák	630c41e2aa	gallium/radeon: fix the draw-calls HUD query reported by kisak on irc, it only applies to stable, not master Fix separated/backported from commit `4140afd04b` ("gallium/radeon: add driver queries for compute/dma call stats and spills") Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>	2016-12-16 18:52:25 +00:00
Marek Olšák	d278c15a17	radeonsi: disable the constant engine (CE) on Carrizo and Stoney It must be disabled until the kernel bug is fixed, and then we'll enable CE based on the DRM version. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `31f988a9d6`)	2016-12-16 18:52:25 +00:00
Marek Olšák	ce56dfca9a	radeonsi: disable CE on SI + AMDGPU Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `49c798e902`) Nominated-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-16 18:52:25 +00:00
Marek Olšák	3197612a1a	radeonsi: fix incorrect FMASK checking in bind_sampler_states Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `38d4859b94`)	2016-12-16 18:52:25 +00:00
Marek Olšák	6d919a6fc6	radeonsi: always restore sampler states when unbinding sampler views Cc: 13.0 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b3a2aa9cba`)	2016-12-16 18:52:24 +00:00
Marek Olšák	f71c3734ce	cso: don't release sampler states that are bound This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and probably many other games too. cso_cache deletes sampler states when the cache size is too big and doesn't check which sampler states are bound, causing use-after-free in drivers. Because of that, radeonsi uploaded garbage sampler states and the hardware went bananas. Other drivers may have experienced similar issues. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (cherry picked from commit `6dc96de303`)	2016-12-16 18:40:42 +00:00
Emil Velikov	6b1c3c3aa0	docs: add sha256 checksums for 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 15:38:13 +00:00
Emil Velikov	01579a9d00	docs: add release notes for 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 15:31:47 +00:00
Emil Velikov	cd9a116558	Update version to 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 15:25:21 +00:00
Marek Olšák	4a5cce8bd5	radeonsi: silence runtime warnings with LLVM 3.9 Such as: Warning: LLVM emitted unknown config register: 0x4 This is a non-intrusive back port of commit `0f7a6ea5e7`.	2016-12-05 13:15:35 +00:00
Marek Olšák	b4c28b1755	radeonsi: disable RB+ blend optimizations for dual source blending This fixes dual source blending on Stoney. The fix was copied from Vulkan. The problem was discovered during internal testing. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `5e5573b1bf`)	2016-12-05 13:13:11 +00:00
Marek Olšák	4f71f93878	radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending copied from Vulkan Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `ff50c44a5f`)	2016-12-05 13:12:21 +00:00
Marek Olšák	a9e5a98c19	radeonsi: always set all blend registers better safe than sorry Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `87b208a54e`) Conflicts: src/gallium/drivers/radeonsi/si_state.c	2016-12-05 13:11:05 +00:00
Nanley Chery	c1cb184488	mesa/fbobject: Update CubeMapFace when reusing textures Framebuffer attachments can be specified through FramebufferTexture* calls. Upon specifying a depth (or stencil) framebuffer attachment that internally reuses a texture, the cube map face of the new attachment would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X). Fix this issue by actually updating the CubeMapFace field. This bug manifested itself in BindFramebuffer calls performed on framebuffers whose stencil attachments internally reused a depth texture. When binding a framebuffer, we walk through the framebuffer's attachments and update each one's corresponding gl_renderbuffer. Since the framebuffer's depth and stencil attachments may share a gl_renderbuffer and the walk visits the stencil attachment after the depth attachment, the uninitialized CubeMapFace forced rendering to TEXTURE_CUBE_MAP_POSITIVE_X. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `63318d34ac`)	2016-12-02 21:23:10 +00:00
Marek Olšák	e3ef7da79c	gallium/radeon: add support for sharing textures with DCC between processes v2: use a function for calculating WORD1 of bo metadata [Lyude] On Fedora 24 and 25, I ended up noticing some rather nasty graphical glitches on my desktop (using an R9 380 w/ amdgpu, Mesa version 12.0.4) while I was in Wayland where the content of windows was garbled, as seen here: https://people.freedesktop.org/~lyudess/archive/11-30-2017/amdgpu-fix-example.png After doing some reverse bisecting with Mesa v13, I ended up tracking down the fix to this patch, which seems to fix the problem entirely on all of the systems I've tested. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Lyude <lyude@redhat.com> CC: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `095803a37a`)	2016-12-02 19:57:55 +00:00
Matt Turner	9666f75b1b	anv: Replace "abi_versions" with correct "api_version". git history shows "abi_versions" was used from the outset. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415 Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `07755237d3`)	2016-12-02 19:57:55 +00:00
Marek Olšák	0afbb9d052	radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `b425b57d1e`)	2016-12-02 19:57:55 +00:00
Marek Olšák	bd114e6be6	radeonsi: fix a crash in imageSize for cubemap arrays Sometimes it was f32, other times it was i32. Now it's always i32. This fixes: GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `3e756f09d4`)	2016-12-02 19:57:55 +00:00
Marek Olšák	29bac28a04	radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader This fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `03708deed2`)	2016-12-02 19:57:55 +00:00
Marek Olšák	31aa3c014b	gallium/radeon: set VPORT_ZMIN/MAX registers correctly Calculate depth ranges from viewport states and pipe_rasterizer_state::clip_halfz. The evergreend.h change is required to silence a warning. This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `687c4be9cf`)	2016-12-02 19:57:55 +00:00
Marek Olšák	b65a812d60	gallium/radeon: unify viewport emission code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `8b0507672e`)	2016-12-02 19:57:54 +00:00
Haixia Shi	5dd6e23ad8	mesa: change state query return value for RGB565 The GL_BGR and GL_UNSIGNED_SHORT_5_6_5_REV are not defined anywhere in OpenGL ES 3.2 (or earlier) specification, and there are no known extensions in the Khronos registry that would add these enums as valid responses for glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE) and glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT) queries. Note that this patch does not change the bit layout returned by the query. As defined by the GL spec, the bit layout of GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV are identical. TEST=dEQP-GLES3.functional.state_query.integers.* Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Stéphane Marchesin <marcheu@chromium.org> Change-Id: I81bbc8ccdc7e125edaeae443baf6fa8fdefcc6b6 (cherry picked from commit `8c56ff643b`)	2016-12-02 19:57:54 +00:00
Adam Jackson	422b584c00	glx/glvnd: Fix dispatch function names and indices As this array was not actually sorted, FindGLXFunction's binary search would only sometimes work. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Adam Jackson <ajax@redhat.com> (cherry picked from commit `8bca8d89ef`)	2016-12-02 19:57:54 +00:00
Adam Jackson	b1bced0d1f	glx/glvnd: Don't modify the dummy slot in the dispatch table Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Adam Jackson <ajax@redhat.com> (cherry picked from commit `deb0eb1660`)	2016-12-02 19:57:54 +00:00
Steinar H. Gunderson	9baee818b6	Fix races during _mesa_HashWalk(). There is currently no protection against walking a hash (using _mesa_HashWalk()) and modifying it at the same time, for instance by inserting or deleting elements. This leads to segfaults in multithreaded code if e.g. someone calls glTexImage2D (which may have to walk the list of FBOs) while another thread is calling glDeleteFramebuffers on another thread with the two contexts sharing lists. The reason for this is that _mesa_HashWalk() doesn't actually take the mutex that normally protects the hash; it takes an entirely different mutex. Thus, walks are only protected against other walks, and there is also no outer lock taking this. There is an old comment saying that this is to fix problems with deadlock if the callback needs to take a mutex; we solve this by changing the mutex to be recursive. A demonstration Helgrind hit from a real application: ==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1 ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400 ==13412== at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395) ==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350) ==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365) ==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669) ==13412== by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473) ==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442) [...] ==13412== This conflicts with a previous read of size 8 by thread #20 ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318 ==13412== at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415) ==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426) ==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683) ==13412== by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043) ==13412== by 0x1EED9410: teximage (teximage.c:3073) ==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105) ==13412== by 0x166A68: operator() (mixer.cpp:454) There are many more interactions than just these two possible. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `2e2562cabb`)	2016-12-02 19:57:54 +00:00
Jason Ekstrand	68dd6ad433	anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4 This fixes hangs in Dota2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a6c3d0f92b`)	2016-12-02 19:57:54 +00:00
Jason Ekstrand	6bcdb0611f	anv/cmd_buffer: Take a command buffer instead of a batch in two helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `1e3e347fd5`)	2016-12-02 19:57:54 +00:00
Emil Velikov	0703bab2cd	cherry-ignore: add reverted LLVM_LIBDIR patch The patch was reverted shortly after it was merged. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-02 19:57:54 +00:00
Anuj Phogat	a7b662633e	i965: Fix GPU hang related to multiple render targets and alpha testing This patch should have been the part of commit `e592f7df`. In a situation when there are multiple render targets with alpha testing enabled, if fragment shader doesn't write to draw buffer zero, it causes the GPU hang on SKL. No GPU hang is seen on HSW. Simulator gives a warning for all gen6+ h/w: "Illegal render target write message length 0xa expected 0xc" This patch fixes the GPU hang as well as the simulator warning with new piglit test fbo-mrt-alphatest-no-buffer-zero-write: https://patchwork.freedesktop.org/patch/118212 No regressions in Jenkins CI system. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (cherry picked from commit `b9df2251c1`)	2016-12-02 19:57:54 +00:00
Marek Olšák	faa684802f	radeonsi: fix an assertion failure in si_decompress_sampler_color_textures This fixes a crash in Deus Ex: Mankind Divided. Release builds were unaffected, so it's not too serious. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `00baaa4752`)	2016-12-02 19:57:54 +00:00
Jason Ekstrand	9a844035c0	i965/fs/generator: Don't use the address immediate for MOV_INDIRECT The address immediate field is only 9 bits and, since the value is in bytes, the highest GRF we can point to with it is g15. This makes it pretty close to useless for MOV_INDIRECT. There were already piles of restrictions preventing us from using it prior to Broadwell, so let's get rid of the gen8+ code path entirely. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `2a4a86862c`)	2016-12-02 19:57:54 +00:00
Tim Rowley	5f4284fd36	swr: [rasterizer] add support for llvm-3.9 v2: use signed compare, remove unneeded vmask Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> (cherry picked from commit `f810907669`)	2016-12-02 19:57:54 +00:00
Tim Rowley	a4cd90283a	swr: [rasterizer jitter] fix llvm-3.7 compile d3d97f8 broke llvm-3.7, which has a mismatched API for setDataLayout/getDataLayout. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> (cherry picked from commit `ae4f2c849a`)	2016-12-02 19:57:53 +00:00
Tim Rowley	0934f29c50	swr: [rasterizer jitter] cleanup supporting different llvm versions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> (cherry picked from commit `d3d97f8395`)	2016-12-02 19:57:53 +00:00
Kenneth Graunke	e6bc5248aa	intel: Fix pixel shader scratch space allocation on Gen9+ platforms. We had missed a bit of errata - PS scratch needs to be computed as if there were 4 subslices per slice, rather than 3. This is a conservative backport of commit `aaee3daa90`. It only increases the scratch amount, unlike the original commit which decreases it on Skylake GT1-3 to avoid overallocating. Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-30 07:01:47 -08:00
Marek Olšák	352902218e	gallium/radeon: make sure HTILE address is aligned properly This should fix random GPU hangs on Hawaii and Fiji. It's already been fixed in 13.0 and later. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-23 18:58:24 +01:00
Marek Olšák	6e77fbc8d7	gallium/radeon: fix behavior of GLSL findLSB(0) This is for 12.0 and older. A different commit fixes 13.0 and newer. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>	2016-11-11 22:41:45 +01:00
Emil Velikov	7b9d7257b2	docs: add sha256 checksums for 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-11 01:55:08 +00:00
Emil Velikov	3776e97f9d	docs: add release notes for 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-11 01:53:32 +00:00
Jonathan Gray	20370d4f1b	mesa: automake: include mesa_glinterop.h in distfile Add mesa_glinterop.h to the list of headers that will get included in the distfile as it is required to build Mesa itself. Corrects a regression introduced in `a89faa2022`. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `23392abf50`)	2016-11-10 21:57:37 +00:00
Emil Velikov	72539c5e38	Update version to 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-10 21:03:41 +00:00
Dave Airlie	e7c1408870	Revert "st/vdpau: use linear layout for output surfaces" This reverts commit `d180de3532`. This is a radeon specific hack that causes problems on nouveau when combined with the SHARED flag later. If radeonsi needs a fix for this, please fix it in the driver. [chk] Using linear surfaces for this makes sense because tiling isn't beneficial and the surfaces can potentially be shared with other GPUs using the VDPAU OpenGL interop. [airlied] I think we need a flag that isn't SHARED/LINEAR that is more SHARED_OTHER_GPU. [mareko] Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate into linear? [mareko] My only concern is decoding performance. If the decoder works in 64x1 blocks, tiling will hurt. That's the theory. I don't know how the decoder works. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A) (cherry picked from commit `d0d5f7600c`)	2016-11-08 20:45:03 +00:00
Marek Olšák	422e4da25c	glx: make interop ABI visible again This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `64c2593a5c`)	2016-11-08 20:45:03 +00:00
Marek Olšák	1040360e9f	egl: make interop ABI visible again This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ee39d4456e`)	2016-11-08 20:45:03 +00:00
Marek Olšák	ad7d21bc3a	egl: use util/macros.h I need the definition of PUBLIC. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bf51b45313`) [Emil Velikov: Keep the MIN2 macro] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 20:44:27 +00:00
Jason Ekstrand	3d4a219dd8	intel/blorp: Rework our usage of ralloc when compiling shaders Previously, we were creating the shader with a NULL ralloc context and then trusting in blorp_compile_fs to clean it up. The only problem was that blorp_compile_fs didn't clean up its context properly so we were leaking. When I went to fix that, I realized that it couldn't because it has to return the shader binary which is allocated off of that context and used by the caller. The solution is to make blorp_compile_fs take a ralloc context, allocate the nir_shaders directly off that context, and clean it all up in whatever function creates the shader and calls blorp_compile_fs. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `43dadb6edd`) [Emil Velikov: attribute src/intel/blorp file movement] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/intel/blorp/blorp.c src/intel/blorp/blorp_clear.c src/intel/blorp/blorp_priv.h src/mesa/drivers/dri/i965/brw_blorp_blit.cpp	2016-11-08 16:23:23 +00:00
Samuel Pitoiset	76a77249ed	nvc0/ir: fix emission of IMAD with NEG modifiers The emitter tried to emit sub instead of subr when src0 has actually a NEG modifier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `84e946380b`)	2016-11-08 16:23:23 +00:00
Tapani Pälli	c1f138149e	egl: set preserved behavior for surface only if config supports it Otherwise we can end up with mismatching behavior between config and surface when client queries surface attributes. As example, configs for DRI3 do not support preserved behavior but here we were setting preserved behavior for pixmap and pbuffer. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Mark Janes <mark.a.janes@intel.com> (cherry picked from commit `2035930966`)	2016-11-08 16:23:22 +00:00
Emil Velikov	3a27b813b4	cherry-ignore: add ClientWaitSync fixes Patches (and extension overall) depends on gallium API and driver work which hasn't landed in branch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:22 +00:00
Marek Olšák	ac3abe534b	winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures Maybe this is why SDMA has been broken for many amdgpu users? SDMA is the only block which is used with imported textures and relies on this variable. DB also uses it, but it doesn't get imported textures, so it's unaffected. I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma is changed to use imported textures. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `6ec3b2a4b1`) [Emil Velikov: resolve trivial conflicts - SI support does not exist in branch] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Conflicts: src/gallium/winsys/amdgpu/drm/amdgpu_surface.c	2016-11-08 16:23:22 +00:00
Marek Olšák	20008a9fb8	gallium/radeon: make sure the address of separate CMASK is aligned properly This should fix random GPU hangs on Hawaii and Fiji. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `dce05b3423`)	2016-11-08 16:23:22 +00:00
Samuel Pitoiset	79a1cc2364	nvc0: use correct bufctx when invalidating CP textures Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7b2712c367`)	2016-11-08 16:23:22 +00:00
Tapani Pälli	cac5e31b0f	mesa: fix error handling in DrawBuffers Patch rearranges error checking so that enum checking provided via destmask happens before other checks. It needs to be done in this order because other error checks do not work properly if there were invalid enums passed. Patch also refines one existing check and its documentation to match GLES 3.0 spec (also in later specs). This was somewhat mysteriously referring to desktop GL but had a check for gles3. Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers No CI regressions observed. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `a1652a059e`)	2016-11-08 16:23:22 +00:00
Tapani Pälli	979e4b9c3f	egl: add check that eglCreateContext gets a valid config Fixes following dEQP test: dEQP-EGL.functional.negative_api.create_context v2: don't break EGL_KHR_no_config_context (Eric Engestrom) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5876f3c85a`) [Emil Velikov: drop EGL_NO_CONFIG_KHR, use MESA_configless_context] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Conflicts: src/egl/main/eglapi.c	2016-11-08 16:23:22 +00:00
Emil Velikov	71d0b5f7c7	cherry-ignore: add N/A EGL revert Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:22 +00:00
Tapani Pälli	8fbad64732	egl/dri2: set max values for pbuffer width and height While these max values were previously fixed for pbuffer creation, this change makes also eglGetConfigAttrib() return correct values. Fixes following dEQP tests: dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b91e1e38e8`)	2016-11-08 16:23:21 +00:00
Axel Davy	a106d8c872	st/nine: Fix locking CubeTexture surfaces. Only one face of Cubetextures was locked when in DEFAULT Pool. Fixes: https://github.com/iXit/Mesa-3D/issues/129 CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr> (cherry picked from commit `eed605a473`)	2016-11-08 16:23:21 +00:00
Axel Davy	5abbe84671	st/nine: Fix mistake in Volume9 UnlockBox In the format fallback path, the height was used instead of the depth. CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr> (cherry picked from commit `fe7bb46134`)	2016-11-08 16:23:21 +00:00
Jonathan Gray	7133f0054d	mapi: automake: set VISIBILITY_CFLAGS for shared glapi shared glapi was previously built without setting CFLAGS for AM_CFLAGS and VISIBILITY_CFLAGS. This resulted in symbols being exported that shouldn't be. The x86 and sparc assembly versions of the dispatch table partially mitigated this by using .hidden. Otherwise shared_dispatch_stub_* were being exported. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (cherry picked from commit `907ace5798`)	2016-11-08 16:23:21 +00:00
Stencel, Joanna	b5cb4f5980	egl/wayland: add missing destroy_window callback The original patch by Joanna added the function pointer and callback yet things got only partially applied - the infra was added, but the implementation was missing. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Fixes: `690ead4a13` ("egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `2e0ab61e29`)	2016-11-08 16:23:21 +00:00
Emil Velikov	d2be28c2bd	automake: don't forget to pick wglext.h in the tarball Earlier commit reworked the header install rules, to ensure that the correct ones are installed only as needed. By doing so it dropped a wildcard which was effectively including the wglext.h header in the tarball. Add the header to the top-level noinst_HEADERS, since it is not meant to be installed (autoconf is not used on Windows platforms). Fixes: `a89faa2022` ("autoconf: Make header install distinct for various APIs (v2)") Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Cc: Chuck Atkins <chuck.atkins@kitware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `3511a86111`)	2016-11-08 16:23:21 +00:00
Emil Velikov	cd2db885bf	get-pick-list.sh: Require explicit "12.0" for nominating stable patches A nomination unadorned with a specific version is now interpreted as being aimed at the 12.0 branch, which was recently opened. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:21 +00:00
Nicolai Hähnle	648f012459	radeonsi: fix 64-bit loads from LDS Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among others. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `4a2dbfff05`)	2016-11-08 16:23:21 +00:00
Nicolai Hähnle	f8f9d7528a	st/mesa: only set primitive_restart when the restart index is in range Even when enabled, primitive restart has no effect when the restart index is larger than the representable values in the index buffer. Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert for radeonsi VI. v2: add an explanatory comment Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) (cherry picked from commit `bfa50f88ce`)	2016-11-08 16:23:20 +00:00
Nicolai Hähnle	3a030e886d	st/glsl_to_tgsi: fix block copies of arrays of doubles Set the type of the left-hand side to the same as the right-hand side, so that when the base type is double, the writemask of the MOV instruction is properly fixed up. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ca592af880`)	2016-11-08 16:23:20 +00:00
Ilia Mirkin	4d478aad50	nv50/ir: process texture offset sources as regular sources With ARB_gpu_shader5, texture offsets can be any source, including TEMPs and IN's. Make sure to process them as regular sources so that we pick up masks, etc. This should fix some CTS tests that feed offsets directly to textureGatherOffset, and we were not picking up the input use, thus not advertising it in the shader header. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dave Airlie <airlied@redhat.com> Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cd45d758ff`)	2016-11-08 16:23:20 +00:00
Ilia Mirkin	e8ae2da8a0	nv50,nvc0: avoid reading out of bounds when getting bogus so info The state tracker tries to attach the info to the wrong shader. This is easy enough to protect against. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `313fba5ee1`)	2016-11-08 16:23:20 +00:00
Jonathan Gray	34cb65716e	genxml: add generated headers to EXTRA_DIST Building the Mesa 12.0.3 distfile failed on a system without python as generated files were not included in the distfile. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `41754f743f`)	2016-11-08 16:23:20 +00:00
Ilia Mirkin	4ddcb9cb22	gm107/ir: fix bit offset of tex lod setting for indirect texturing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `8c78fdb328`)	2016-11-08 16:23:20 +00:00
Ilia Mirkin	aeabbc1e1d	gm107/ir: fix texturing with indirect samplers The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `ecea2f69ef`)	2016-11-08 16:23:20 +00:00
Kenneth Graunke	9cecb50bf2	i965: Fix gl_InvocationID in dual object GS where invocations == 1. dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations draws using a geometry shader that specifies layout(points, invocations = 1) in; and then uses gl_InvocationID. According to the Haswell PRM, the "GS Instance ID 0" (and 1) thread payload fields are undefined in dual object mode: "If 'dispatch mode' is DUAL_OBJECT this field is not valid." But there's no point in using them - if there's only one invocation, the ID will be 0. So just load a constant. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `9f677d6541`)	2016-11-08 16:23:20 +00:00
Nicolai Hähnle	c403863348	st/glsl_to_tgsi: fix atomic counter addressing When more than one atomic counter buffer is in use, UniformStorage[n].opaque is set up to contain indices that are contiguous across all used buffers. This appears to be used by i965 via NIR, but for TGSI we do not treat atomic counter buffers as opaque, so using the data in the opaque array is incorrect. Fixes GL45-CTS.compute_shader.resource-atomic-counter. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `1dd99a15a4`)	2016-11-08 16:23:19 +00:00
Nicolai Hähnle	7c7973606f	radeonsi: fix indirect loads of 64 bit constants This fixes GL45-CTS.compute_shader.fp64-case3. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `51f9b38ce8`)	2016-11-08 16:23:19 +00:00
Chad Versace	341889d6ca	egl: Don't advertise unsupported platform extensions Mesa's set of supported platform extensions depends on the autoconf option --with-egl-platforms=foo,bar,baz. If --with-egl-platforms lacks foo, then eglGetPlatformDisplay(EGL_PLATFORM_FOO, ...) unconditionally fails. So, if --with-egl-platforms lacks foo, then remove EGL_VENDOR_platform_foo from the EGL client extension string. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c177ef9d47`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/egl/main/eglglobals.c	2016-11-08 16:23:19 +00:00
Emil Velikov	866aee0264	egl/x11: don't crash if dri2_dpy->conn is NULL The dri3 version of commits `60e9c35b3a` and `6de9a03bed`. While using xcb_connect() guarantees that we always get a non NULL return value, XGetXCBConnection() does/can not. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> (cherry picked from commit `b10c05d4ff`)	2016-11-08 16:23:19 +00:00
Emil Velikov	32a469b8ed	isl/gen6: correctly check msaa layout samples count Samples == 1 is a valid value, so returning false is plain wrong. Seeming copy/paste typo introduced since day 1. Fixes: `afdadec77f` ("isl: Implement isl_surf_init() for gen4-gen9") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> (cherry picked from commit `84f9ef1de4`)	2016-11-08 16:23:19 +00:00
Vinson Lee	eb9236e275	Revert "mesa_glinterop: remove inclusion of GLX header" This reverts commit `8472045b16`. Conflicts: include/GL/mesa_glinterop.h This patch fixes this build error with GCC 4.4. Compiling src/glx/dri_common_interop.c ... In file included from src/glx/dri_common_interop.c:33: include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’ include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Vinson Lee <vlee@freedesktop.org> (cherry picked from commit `c10dcb2ce8`) Squashed with commit: mesa_glinterop: allow building without X and related headers This commit effectively reverts `c10dcb2ce8` and fixes the typedef redefinition which inspired it. In order to prevent requiring X packages at build time earlier commit forward declared the required X/GLX typedefs. Since that approach introduced typedef redefinition (a C11 feature) it was reverted. To avoid the redefinition while _not_ mandating X and related headers forward declare the structs and use those through the header. As anyone uses the mesa interop header they ensure that the X (or others in terms of EGL) headers are included, which ensures that everything is resolved within the compilation unit. Cc: Vinson Lee <vlee@freedesktop.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Chih-Wei Huang <cwhuang@android-x86.org> Fixes: `c10dcb2ce8` ("Revert "mesa_glinterop: remove inclusion of GLX header"") Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `c85b34ffd0`)	2016-11-08 16:23:19 +00:00
Mario Kleiner	50afa72f3c	glx: Perform check for valid fbconfig against proper X-Screen. Commit `cf804b4455` ('glx: fix crash with bad fbconfig') introduced a check in glXCreateNewContext() if the given config is a valid fbconfig. Unfortunately the check always checks the given config against the fbconfigs of the DefaultScreen(dpy), instead of the actual X-Screen specified in the config config->screen. This leads to failure whenever a GL context is created on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of a multi-x-screen setup, where the default screen is typically 0. Fix this by using config->screen instead of DefaultScreen(dpy). Tested to fix context creation failure on a dual-x-screen setup. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0c94ed0987`)	2016-11-08 16:23:19 +00:00
Dave Airlie	80409971c0	anv/wsi: fix apps that acquire multiple images up front This fix was found in the radv codebase when running dota2, no idea if anyone has reported it on anv, but the same problem occurs. Once an image is acquired we need to mark it busy. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `8980ac0411`) Squashed with commit anv: fix the wayland wsi busy flag setting Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `a3834ebaf9`)	2016-11-08 16:23:19 +00:00
Dave Airlie	920150f28a	anv: initialise and increment send_sbc At least set this to not be uninitialised memory. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `dfe74fd1a9`)	2016-11-08 16:23:18 +00:00
Marek Olšák	12bdcc105c	radeonsi: disable ReZ This is a serious performance fix. Discovered by luck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `e12c1cab5d`)	2016-11-08 16:23:18 +00:00
Nicolai Hähnle	1cb2a483ba	st/mesa: fix vertex elements setup for doubles Whether one or two slots are taken up by one API array depends on the vertex shader, not on how the array is configured. When an array is set up with fewer components than the shader expects, the high components are undefined. Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `d413fbb159`)	2016-11-08 16:23:18 +00:00
Nicolai Hähnle	79e84bafc1	st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `1d7685e52c`)	2016-11-08 16:23:18 +00:00
Nicolai Hähnle	5181624675	st/glsl_to_tgsi: simplify translate_tex_offset This fixes a bug with offsets from uniforms which seems to have only been noticed as a crash in piglit's arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag on radeonsi. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `b234e37765`)	2016-11-08 16:23:18 +00:00
Ilia Mirkin	d18d830744	nvc0/ir: fix textureGather with a single offset Recent fix for non-const offsets broke the case of a single offset (vs 4 offsets). The later code relies on the offs array to contain null values to tell whether they should be added onto the srcs list. Fixes: `5239bd592` ("nvc0/ir: fix overwriting of value backing non-constant gather offset") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `a48a343c29`)	2016-11-08 16:23:18 +00:00
Ilia Mirkin	bef0cc6287	nv50/ir: copy over value's register id when resolving merge of a phi The offset needs to be properly copied over to the phi value, otherwise it will get assigned to the base of the merge instead of the proper location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `300b5ad023`)	2016-11-08 16:23:18 +00:00
Nicolai Hähnle	f0f5bd9607	st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations This optimization is incorrect with 64-bit operations, because the channel-splitting logic in emit_asm ends up being applied twice to the source operands. A lucky coincidence of how the writemask test works resulted in this optimization basically never being applied anyway. As far as I can tell, the only case where it would (incorrectly) have been applied is something like dvec2 d; float x = (float)d.y; which nobody seems to have ever done. But the moral equivalent does occur in one of the component layout piglit tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `63193b9cde`)	2016-11-08 16:23:18 +00:00
Axel Davy	ca135ebd76	st/nine: Fix the calculation of the number of vs inputs Fixes hangs on radeonsi, and assert on llvmpipe. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d2fd296648`)	2016-11-08 16:23:17 +00:00
Axel Davy	ec2751f967	gallium/util: Really allow aliasing of dst for u_box_union_* Gallium nine relies on aliasing to work with this function. Without this patch, dirty region tracking was incorrect, which could lead to incorrect textures or vertex buffers. Fixes several game bugs with nine. Fixes https://github.com/iXit/Mesa-3D/issues/234 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2290eac84e`)	2016-11-08 16:23:17 +00:00
Ilia Mirkin	24a16d0a9d	nvc0/ir: fix overwriting of value backing non-constant gather offset Normally the value is an immediate, which is moved to some temporary, so there's no problem. In the case of a non-constant offset (as allowed by ARB_gpu_shader5), we have to take care to copy it first before using it to build up the bits. This fixes a compilation error observed in F1 2015. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `5239bd5920`)	2016-11-08 16:23:17 +00:00
Eric Anholt	9c5de2546c	gallium: Fix install-gallium-links.mk on non-bash /bin/sh Debian uses dash by default, which doesn't do '+='. Fixes servo's osmesa-based headless testing system, which was looking for libOSMesa in the lib/ directory. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `ec9ed1c4d8`)	2016-11-08 16:23:17 +00:00
Martin Peres	17f59da9db	loader/dri3: import prime buffers in the currently-bound screen This tries to mirror the codepath taken by DRI2 in IntelSetTexBuffer2() and fixes many applications when using DRI3: - Totem with libva on hw-accelerated decoding - obs-studio, using Window Capture (Xcomposite) as a Source - gstreamer with VAAPI v2: - introduce get_dri_screen() in the dri3 loader's vtable (krh) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Tested-by: Ionut Biru <biru.ionut@gmail.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> (cherry picked from commit `a599b1c203`)	2016-11-08 16:23:17 +00:00
Martin Peres	0ae4d909ea	loader/dri3: add get_dri_screen() to the vtable This allows querying the current active screen from the loader's common code. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> (cherry picked from commit `0247e5ee3e`)	2016-11-08 16:23:17 +00:00
Chuck Atkins	2fedb106bb	autoconf: Make header install distinct for various APIs (v2) This fixes a problem where GL headers would only get installed if glx was enabled. So if osmesa was enabled but not glx, then the GL headers required by osmesa would be missing from the install. v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `a89faa2022`)	2016-11-08 16:23:17 +00:00
Chad Versace	c5de4cbf32	i965/sync: Fix uninitalized usage and leak of mutex We locked an uninitialized mutex in the callstack glClientWaitSync intel_gl_client_wait_sync brw_fence_client_wait_sync because we forgot to initialize it in intel_gl_fence_sync. (The EGLSync codepath didn't have this bug. It initialized the mutex in intel_dri_create_sync). We also forgot to tear down (mtx_destroy) the mutex when destroying the sync object. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `ce1d67c2e5`)	2016-11-08 16:23:16 +00:00
Marek Olšák	e7b274c552	radeonsi: fix texture border colors for compute shaders There are VM faults without this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `cc4a19c4ad`)	2016-11-08 16:23:16 +00:00
Marek Olšák	63c1a5d391	radeonsi: fix interpolateAt opcodes for .zw components Not returning garbage in .zw seems pretty important. This fixes: GL45-CTS.shader_multisample_interpolation.render.interpolate_at__check. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `1b37e5541c`)	2016-11-08 16:23:16 +00:00
Kenneth Graunke	d16be6898b	i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `bab1c05634`)	2016-11-08 16:23:16 +00:00
Kenneth Graunke	4b0512ade4	i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom. CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old comment was there, but not the actual bit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `ce6c80ebbb`)	2016-11-08 16:23:16 +00:00
Emil Velikov	39344de587	cherry-ignore: add update_renderbuffer_read_surfaces() The function (and underlying work) is not in branch. Former introduced with `786108e7b2`. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:16 +00:00
Kenneth Graunke	915683485f	i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA. 3DSTATE_PS doesn't need this. 3DSTATE_PS_EXTRA however does, for brw_color_buffer_write_enabled(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `0047d600af`)	2016-11-08 16:23:16 +00:00
Kenneth Graunke	bc04c92aef	i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `28e1538be7`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/drivers/dri/i965/gen6_clip_state.c	2016-11-08 16:23:16 +00:00
Kenneth Graunke	5fb22e1258	i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom. Needed for user clip plane enables. Broken since this code was introduced. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `78df96256b`)	2016-11-08 16:23:16 +00:00
Chad Versace	b5de58d7e1	egl: Fix truncation error in _eglParseSyncAttribList64 The function stores EGLAttrib values in EGLint variables. On 64-bit systems, this truncated the values. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `69adb9a778`)	2016-11-08 16:23:16 +00:00
Emil Velikov	082ea77cdf	cherry-ignore: add EGL_KHR_debug fix The extension landed after the branchpoint. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:15 +00:00
Nicolai Hähnle	a5c0b8784a	gallium/radeon: cleanup and fix branch emits Some of the existing code is needlessly complicated. The basic principle should be: control-flow opcodes emit branches to properly terminate the current block, _unless_ the current block already has a terminator (which happens if and only if there was a BRK or CONT). This also fixes a bug where multiple terminators were created in a block. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `6f87d7a146`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c	2016-11-08 16:23:15 +00:00
James Legg	020550e099	radeonsi: Fix primitive restart when index changes If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `e33f31d61f`)	2016-11-08 16:23:15 +00:00
Matt Whitlock	e7491c3bbd	gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `42ed8a6c9c`)	2016-11-08 16:23:15 +00:00
Matt Whitlock	39c0535646	st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `ac6064f918`)	2016-11-08 16:23:15 +00:00
Matt Whitlock	7c27d56535	st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `0c060f691c`)	2016-11-08 16:23:15 +00:00
Matt Whitlock	ea3e778bff	gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `5d0069eca2`)	2016-11-08 16:23:15 +00:00
Matt Whitlock	d82738fbd9	egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `c8fd7d060d`)	2016-11-08 16:23:15 +00:00
Jason Ekstrand	4e1298be04	nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks Previously, we were saving off the last nir_block in a vtn_block before moving on so that we could find the nir_block again when it came time to handle phi sources. Unfortunately, NIR's control flow modification code is inconsistent when it comes to how it splits blocks so the block pointer we saved off may point to a block somewhere else in the shader by the time we get around to handling phi sources. In order to get around this, we insert a nop instruction and use that as the logical end of our block. Since the control flow manipulation code respects instructions, the nop will keep its place like any other instruction and we can easily find the end of our block when we need it. This fixes a bug triggered by a couple of vkQuake shaders. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `6ffbfc760d`)	2016-11-08 16:23:14 +00:00
Jason Ekstrand	17429a22a6	nir: Add a nop intrinsic This intrinsic has no destination, no sources, no variables, and can be eliminated. In other words, it does nothing and will always get deleted by dead code elimination. However, it does provide a quick-and-easy way to temporarily tag a particular location in a NIR shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7697b4b98b`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/compiler/nir/nir_intrinsics.h	2016-11-08 16:23:14 +00:00
Tapani Pälli	d8d7de3b29	egl: stop claiming support for pbuffer + msaa This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test and same crash in many dEQP EGL tests. I also found that some Qt example did a workaround because of this crash: https://bugreports.qt.io/browse/QTBUG-47509 v2: Ian pointed out that v1 removed support for all multisample configs, including window ones. This one removes pbuffer bit when adding configs, now only pbuffer+msaa gets rejected and window+msaa continues to work. Fixed also comment (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `4d6d55deef`)	2016-11-08 16:23:14 +00:00
Jason Ekstrand	5e4aeeb8ec	nir/spirv/cfg: Detect switch_break after loop_break/continue While the current CFG code is valid in the case where a switch break also happens to be a loop continue, it's a bit suboptimal. Since hardware is capable of handling the continue as a direct jump, it's better to use a continue instruction when we can than to bother with all of the nasty switch break lowering. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ef3c5ac7fb`)	2016-11-08 16:23:14 +00:00
Jason Ekstrand	12d09e24f8	nir/spirv/cfg: Handle switches whose break block is a loop continue It is possible that the break block of a switch is actually the continue of the loop containing the switch. In this case, we need to identify the break block as a continue and break out of current level of CFG handling. If we don't, the continue portion of the loop will get handled twice, once by following after the break and a second time by the loop handling code handling it explicitly. This fixes 6 of the new Vulkan CTS tests: - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order* - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4d02faede5`)	2016-11-08 16:23:14 +00:00
Eric Anholt	cd88ea6d82	travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> (cherry picked from commit `78ab62b1e9`)	2016-11-08 16:23:14 +00:00
Eric Anholt	c004014db4	travis: Enable vc4 in libdrm to satisfy vc4 test build dependency. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> (cherry picked from commit `084678ccbb`)	2016-11-08 16:23:14 +00:00
Eric Anholt	6931dab9b4	travis: Update to the Ubuntu Trusty image. This will hopefully fix wget from x.org (no real reason explained in Travis CI bug reports), and may also mean that we can enable LLVM driver builds. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> (cherry picked from commit `80a872f3f0`)	2016-11-08 16:23:14 +00:00
Eric Anholt	a8010d31ce	travis: Parse configure.ac to pick an updated LIBDRM_VERSION. Travis has been broken a couple of times by configure.ac updates. To make it useful, auto-update the version necessary. This could potentially be used for other dependencies, too, but those get bumped less frequently. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> (cherry picked from commit `ecbc76cf6e`)	2016-11-08 16:23:13 +00:00
Ian Romanick	0909e54c3c	glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept At this point in the code, s must be visit_continue. If the child returned visit_stop, visit_stop is the only correct thing to return. Found by inspection. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `ea6ed2379d`)	2016-11-08 16:23:13 +00:00
Tim Rowley	6cd0438840	configure.ac: add llvm inteljitevents component if enabled Needed to successfully link llvmpipe or swr when using shared llvm libs built with inteljitevents enabled. v2: Make adding inteljitevents component global rather than just llvmpipe/swr, since libgallium will have a symbol dependency. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bacdd9ef4c`)	2016-11-08 16:23:13 +00:00
Nicholas Bishop	f228c90f80	st/dri: check pipe_screen->resource_get_handle() return value Change dri2_query_image to check the return value of resource_get_handle and return GL_FALSE if an error occurs. For reference this is an example callstack that should propagate the error back to the user: i915_drm_buffer_get_handle i915_texture_get_handle u_resource_get_handle_vtbl dri2_query_image gbm_dri_bo_get_fd gbm_bo_get_fd Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `aa560e8e63`) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/state_trackers/dri/dri2.c	2016-11-08 16:23:13 +00:00
Nicholas Bishop	29320aa06a	gbm: return appropriate error when queryImage() fails Change gbm_dri_bo_get_fd to check the return value of queryImage and return -1 (an invalid file descriptor) if an error occurs. Update the comment for gbm_bo_get_fd to return -1, since (apart from the above) we already return -1 on error. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2d05ba2ca0`)	2016-11-08 16:23:13 +00:00
Emil Velikov	48f001b836	cherry-ignore: add vaapi encode fix The encode codepaths landed after the branch point. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:13 +00:00
Brian Paul	0823c98a65	st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj() Some demos, like Heaven, were creating and destroying a large number of sampler views because of a swizzle issue. Basically, we compute the sampler view's swizzle by examining the texture format, user swizzle, depth mode, etc. Later, during validation we recompute that swizzle (in case something like depth mode changes) and see if it matches the view's swizzle. In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle returned SWIZZLE_XYZW but the u_sampler_view_default_template() function was setting the sampler view's swizzle to SWIZZLE_XY01. This mismatch caused the validation step to always "fail" so we'd destroy the old sampler view and create a new one. By removing the conditional, the sampler view's swizzle and the computed texture swizzle match and validation "passes". When creating a new sampler view, we always want to use the texture swizzle which we just computed. Fixes VMware issue 1733389. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Charmaine Lee <charmainel@vmware.com> (cherry picked from commit `1cdc232e1a`)	2016-11-08 16:23:13 +00:00
Samuel Pitoiset	a287f820b8	gk110/ir: fix wrong emission of OP_NOT This should emit src0 instead of src1. Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `d8b4f5fcca`)	2016-11-08 16:23:13 +00:00
Samuel Pitoiset	725ef1fbba	nvc0/ir: fix subops for IMAD Offset was wrong, it's at bit 8, not 4. Also, uses subr instead of sub when src2 has neg. Similar to GK110 now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `50baaf6bc6`)	2016-11-08 16:23:13 +00:00
Marek Olšák	a4a2378408	mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc This fixes 66 CTS tests on st/mesa. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `d58a3906cb`)	2016-11-08 16:23:12 +00:00
Emil Velikov	cc34777cec	cherry-ignore: add non-applicable i965 commit Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:12 +00:00
Hans de Goede	cd614f31b8	pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms Make pipe_loader_sw_probe_kms take ownership of the passed in fd, like pipe_loader_drm_probe_fd does. The only caller is dri_kms_init_screen which passes in a dupped fd, just like dri2_init_screen passes in a dupped fd to pipe_loader_drm_probe_fd. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `459cc94507`) Squashed with commit: pipe_loader_sw: Don't invoke Unix close() on Windows. Trivial. (cherry picked from commit `c6d17701c8`)	2016-11-08 16:23:12 +00:00
Vedran Miletić	67c99c1245	clover: Fix build against clang SVN >= r273191 setLangDefaults() now requires PreprocessorOptions as an argument. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> (cherry picked from commit `82e0bbd01a`) Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com> Nominated-by: Timo Aaltonen <tjaalton@ubuntu.com>	2016-11-08 16:23:12 +00:00
Kenneth Graunke	6a72af2aeb	mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness. This is supposed to be exposed with the GL_KHR_robustness extension, which we support on ES 2.0 and later. On desktop GL, it's also exposed by GL_ARB_robustness, which is supported by all drivers ("dummy_true"). so we also allow desktop GL. Fixes: - ES32-CTS.robust.robustness.noResetNotification - ES32-CTS.robust.robustness.loseContextOnReset Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `3bcdc2e3db`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/main/get.c src/mesa/main/get_hash_params.py	2016-11-08 16:23:12 +00:00
Kenneth Graunke	e591b0b206	nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar(). This is mandatory. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `e6eed3533e`)	2016-11-08 16:23:12 +00:00
Brendan King	96aa7ca98c	configure.ac: fix the name of the Wayland Scanner pc file The Wayland Scanner pkg-config file is called wayland-scanner.pc. Fixes: `153539bd9d` ("configure: rework wayland_scanner handling (fix make distcheck)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Brendan King <Brendan.King@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `95f3e5861c`)	2016-11-08 16:23:12 +00:00
Marek Olšák	b1c5719d7b	radeonsi: fix FP64 UBO loads with indirect uniform block indexing No known tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `15a127bc2c`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/radeonsi/si_shader.c	2016-11-08 16:23:12 +00:00
Ilia Mirkin	ec1f6700b6	st/mesa: fix is_scissor_enabled when X/Y are negative Similar to commit `49c24d8a24` ("i965: fix noop_scissor range issue on width/height") - take the X/Y into account to determine whether the scissor covers the whole area or not. Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `742832434a`)	2016-11-08 16:23:11 +00:00
Julien Isorce	b2495c2202	st/va: also honors interlaced preference when providing a video format This fixes a crash when using the preferred video format with vaapisink on Nvidia hardware. Also caught by the following assert: nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed. TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com> Tested-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (cherry picked from commit `bf901a2f8c`)	2016-11-08 16:23:11 +00:00
Chuanbo Weng	7ad97bc307	gbm: fix potential NULL deref of mapImage/unmapImage. The mapImage/unmapImage functions of DRIimage extension can be NULL, so we should add additional check for them. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `9a1eb54237`)	2016-11-08 16:23:11 +00:00
Ilia Mirkin	106f2dc8a7	gm107/ir: AL2P writes to a predicate register We have to force it to write to predicate 7 (aka PT) in order for it not to mess up another predicate. Unclear what would be returned in the predicate, perhaps an error code for out-of-bounds requests. Blob doesn't seem to check it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `a22aee5ad1`)	2016-11-08 16:23:11 +00:00
Marek Olšák	a3c232db2f	radeonsi: take compute shader and dispatch indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `e62caf576e`) Squashed with commit: radeonsi: flush TC L2 before using a compute indirect buffer There is no known test for this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `08bcbfdc07`)	2016-11-08 16:23:11 +00:00
Max Staudt	f75a108434	r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state() is never called. However, if R300_VAP_CNTL is never set, the hardware (at least the RS690 I tested this on) comes up with rendering artifacts, and parts that are uploaded before this "fix" remain broken in VRAM. This causes artifacts as in fdo#69076 ("triangle flickering"). It seems like this setup needs to happen at least once after power on for 3D rendering to work properly. In the DDX with EXA, this happens in RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an Xv request. So playing back a video or starting a GTK+2 application fixes 3D rendering for the rest of the session. However, this auto-fix doesn't happen when EXA is not used, such as with GLAMOR or Wayland. This patch ensures the register is configured even in absence of the DDX's EXA module. The register setting is taken from: xf86-video-ati -- RADEONInit3DEngineInternal() mesa/src/mesa/drivers/dri/r300 -- r300EmitClearState() Tested on RS690. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Max Staudt <mstaudt@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `02675622b0`)	2016-11-08 16:23:11 +00:00
Jason Ekstrand	258e651e0f	nir/spirv: Refactor variable deocration handling Previously, we didn't apply variable decorations to the members of a split structure variable. This doesn't quite work, unfortunately, because things such as the "flat" qualifier may get applied to an entire structure instead of propagated to the members. This fixes 9 of the new CTS tests in the dEQP-VK.glsl.linkage.varying.struct.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a00bd7bc27`)	2016-11-08 16:23:11 +00:00
Jason Ekstrand	662a7c627b	nir/spirv: Break variable decoration handling into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f5505730d3`)	2016-11-08 16:23:11 +00:00
Ilia Mirkin	ed8e99761d	nir: fix definition of pack_uvec2_to_uint Found by inspection. Untested beyond compilation. This also matches the logic used in nir_lower_alu_to_scalar. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `8c8874eafb`)	2016-11-08 16:23:10 +00:00
Ilia Mirkin	98c9fcf259	mesa/formatquery: limit ES target support, fix core context support First off, as late as ES 3.2, GetInternalformat only supports RENDERBUFFER and 2DMS(_ARRAY) targets. Secondly, the _mesa_has_ext helpers are very accurate... a little too accurate, some might say. If we only show an extension in compat profiles because core profiles have the functionality guaranteed, they will return false. Fix these to either check for a core profile explicitly, or to a different-but-identical extension available in core profile. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matteo Bruni <matteo.mystral@gmail.com> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `c42acd93d4`)	2016-11-08 16:23:10 +00:00
Ilia Mirkin	5eabc81d50	main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer Add a separate extension check for that format. Prevents glTexImage from trying to find a matching format, which fails on drivers without support for this format. Fixes: sized-texture-format-channels (on a3xx) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `36347c8d6f`)	2016-11-08 16:23:10 +00:00
Jason Ekstrand	2fbce4c9e1	nir/spirv: Use the correct sources for CompareExchange on images The CompareExchange operation has two "Memory Semantics" parameters instead of one so the real arguments start at w[7] instead of w[6]. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `f2a10937d8`)	2016-11-08 16:23:10 +00:00
Jason Ekstrand	ab6126bb1d	nir/spirv: Swap the argument order for AtomicCompareExchange SPIR-V has the two arguments in the opposite order from GLSL. NIR uses the GLSL order so we had them backwards. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `0ead7bef6b`)	2016-11-08 16:23:10 +00:00
Marek Olšák	f5de7da4e1	radeonsi: fix cubemaps viewed as 2D This fixes: GL43-CTS.texture_view.view_sampling v2: fix a typo, merge both if statements Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `a4fa215058`)	2016-11-08 16:23:10 +00:00
Ilia Mirkin	05ec6a7c03	a3xx: use window scissor to simulate viewport xy clip Unfortunately a3xx does not have a separate disable for depth clipping, so when depth clamp is enabled, we disable the whole 3d clipper logic. This in turn also gets rid of the xy clip that it would normally do. When we detect this would happen, instead we integrate the viewport into the window scissor. This may have slightly different behavior around wide points, but it's unlikely that anything depends on this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `ca313e00b6`) [Emil Velikov: s|batch->||g] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/freedreno/a3xx/fd3_emit.c	2016-11-08 16:23:10 +00:00
Ilia Mirkin	dee992caa1	a3xx: make use of software clipping when hw can't handle it The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs, or a clip vertex, or clip distances are in use, then we must use the fallback discard-based clipping from the frag shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `83d7230fd5`)	2016-11-08 16:23:10 +00:00
Ilia Mirkin	68620d14d4	a3xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `dac72234c7`) [Emil Velikov: s|batch->||g] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-08 16:23:10 +00:00
Ilia Mirkin	42a890891c	nv30: set usage to staging so that the buffer is allocated in GART The code a few lines below expects to migrate the bo in question to VRAM. Since we're filling the initial data via CPU, it's more efficient to create the temporary buffer in GART. There is no "push" method implemented, otherwise we'd use that instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `6118bcab4e`)	2016-10-31 17:39:42 +00:00
Michel Dänzer	d3d33918c7	loader/dri3: Overhaul dri3_update_num_back Always use 3 buffers when flipping. With only 2 buffers, we have to wait for a flip to complete (which takes non-0 time even with asynchronous flips) before we can start working on the next frame. We were previously only using 2 buffers for flipping if the X server supports asynchronous flips, even when we're not using asynchronous flips. This could result in bad performance (the referenced bug report is an extreme case, where the inter-frame stalls were preventing the GPU from reaching its maximum clocks). I couldn't measure any performance boost using 4 buffers with flipping. Performance actually seemed to go down slightly, but that might have been just noise. Without flipping, a single back buffer is enough for swap interval 0, but we need to use 2 back buffers when the swap interval is non-0, otherwise we have to wait for the swap interval to pass before we can start working on the next frame. This condition was previously reversed. Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260 Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `1e3218bc5b`) Squashed with commit: loader/dri3: Always use at least two back buffers This can make a significant difference for performance with some extreme test cases such as vblank_mode=0 glxgears. Fixes: `1e3218bc5b` ("loader/dri3: Overhaul dri3_update_num_back") Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit `dc3bb5db8c`)	2016-10-31 17:39:32 +00:00
Emil Velikov	09460b8cf7	docs: add sha256 checksums for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-15 11:29:24 +01:00
Emil Velikov	d79b2e7bf3	docs: add release notes for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-15 10:19:26 +01:00
Emil Velikov	e487048f8c	Update version to 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-15 10:15:57 +01:00
Emil Velikov	71b47b9cfe	Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations" This reverts commit `be0344f630`. The commit depends on `48e9ecc47f` ("Revert "i965/miptree: Set logical_depth0 == 6 for cube maps") which was reverted earlier. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97781	2016-09-14 12:14:01 +01:00
Jose Fonseca	bde8f418bd	appveyor: Update winflexbison download URL. This particular version got moved into a `old_versions` subdirectory.	2016-09-14 11:21:04 +01:00
Emil Velikov	614fb93a6d	docs: add sha256 checksums for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-05 16:03:06 +01:00
Emil Velikov	2fc6a31f10	docs: add release notes for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-05 12:14:11 +01:00
Emil Velikov	63001e7ddf	Update version to 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-05 12:09:24 +01:00
Emil Velikov	7757de1ebf	glx/glvnd: list the strcmp arguments in correct order Currently, due to the inverse order, strcmp will produce negative result when the needle is towards the start of the haystack. Thus on the next iteration(s) we'll end up further towards the end and eventually fail to locate the entry. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (cherry picked from commit `62b224d428`)	2016-09-05 11:59:25 +01:00
Ilia Mirkin	8e9b6161eb	gk110/ir: fix quadop dall emission We recently started to always emit the NDV (== dall) bit for quadops. However it was folded into the wrong code word. Fixes: `e0a067ed48` (nv50/ir: always emit the NDV bit for OP_QUADOP) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `61e978524a`)	2016-09-05 11:37:18 +01:00
Ilia Mirkin	7c96b11fd6	a4xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry-picked from `89f00f749f`) [imirkin: adjust ctx->batch to just ctx]	2016-09-05 11:37:07 +01:00
Emil Velikov	49e84b8f18	Revert "i965/miptree: Set logical_depth0 == 6 for cube maps" This reverts commit `48e9ecc47f`. The commit regressed several piglit tests on SNB/ILK hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97567	2016-09-05 11:37:06 +01:00
Jose Fonseca	463d9ea0dc	appveyor: Force Visual Studio 2013 image. It seems the default build image is now Visual Studio 2015, and Visual Studio 2013 is not installed.	2016-09-01 22:10:08 +01:00
Jose Fonseca	53e8701c7b	appveyor: Install pywin32 extensions. AppVeyor build images seem to have been upgraded to Python 2.7.12, but no longer have pywin32 pre-installed.	2016-09-01 22:10:08 +01:00
Ilia Mirkin	0fa0e2a505	nv30: only bail on color/depth bpp mismatch when surfaces are swizzled The actual restriction is a little weaker than I originally thought. See https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the suggestion. This also explains why things weren't always failing before, only sometimes. We will allocate a non-swizzled depth buffer for NPOT winsys buffer sizes, which they almost always are. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `8caf2cb0c0`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	9a8d605398	anv: Rework pipeline caching The original pipeline cache that Kristian wrote was based on a now-false premise that the shaders can be stored in the pipeline cache. The Vulkan 1.0 spec explicitly states that the pipeline cache object is transient and you are allowed to delete it after using it to create a pipeline with no ill effects. As nice as Kristian's design was, it doesn't jibe with the expectation provided by the Vulkan spec. The new pipeline cache uses reference-counted anv_shader_bin objects that are backed by a large state pool. The cache itself is just a hash table mapping key hashes to anv_shader_bin objects. This has the added advantage of removing one more hand-rolled hash table from mesa. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476 Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> (cherry picked from commit `10f9901bce`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	17d40ca82b	anv/pipeline: Add support for caching the push constant map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (cherry picked from commit `ffcef720b7`) [Emil Velikov: dependency for the next patch] Nominated-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 11:39:47 +01:00
Jason Ekstrand	a8e4b59cfd	anv: Add a struct for storing a compiled shader This new anv_shader_bin struct stores the compiled kernel (as an anv_state) as well as all of the metadata that is generated at shader compile time. The struct is very similar to the old cache_entry struct except that it is reference counted and stores the actual pipeline_bind_map. Similarly to cache_entry, much of the actual data is floating-size and stored after the main struct. Unlike cache_entry, which was stored in GPU-accessible memory, the storage for anv_shader_bin kernels comes from a state pool. The struct itself is reference-counted so that it can be used by multiple pipelines at a time without fear of allocation issues. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> (cherry picked from commit `6899718470`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	2566315063	anv: Add pipeline_has_stage guards a few places All of these worked before because they were depending on prog_data to be null. Soon, we won't be able to depend on a nice prog_data pointer and it's nice to be more explicit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `13c09fdd0c`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	b529a77d79	anv: Remove unused fields from anv_pipeline_bind_map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b259d86ad6`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	d159ca4fa2	anv/pipeline: Properly handle OOM during shader compilation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d5945bec12`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	1ce414accf	anv/allocator: Correctly set the number of buckets The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should be inclusive and we have asserts that ensure that you never try to allocate a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2). However, without adding 1 to the difference, we allocate 1 too few buckets and so, even though we have an assert, anything landing in the last bucket will fail to allocate properly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a0f5c496e3`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	e12b7486b3	anv/pipeline: Fix bind maps for fragment output arrays Found by inspection. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4200c2266e`)	2016-09-01 11:39:47 +01:00
Jason Ekstrand	1d0c79b13b	anv/descriptor_set: memset anv_descriptor_set_layout We hash this data structure so we can't afford to have uninitialized data even if it is just structure padding. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d316cec1c1`)	2016-09-01 11:39:46 +01:00
Samuel Pitoiset	04f04ab6a6	nv50/ir: always emit the NDV bit for OP_QUADOP This silences a divergent error found with F1 2015. Basically, the NDV bit has to be set when a FSWZ instruction is inside divergent code, but it's not needed otherwise. The correct fix should be to set it only in divergent code situations. GM107 emitter already sets that bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e0a067ed48`)	2016-09-01 11:39:46 +01:00
Emil Velikov	5af16ddf84	i915: Check return value of screen->image.loader->getBuffers Ported from the i965 commit `e7ab358e81`. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (cherry picked from commit `5de640a518`)	2016-09-01 11:39:46 +01:00
Ilia Mirkin	178c34c535	nouveau: always enable at least one RC Experimentally, this is required for glxgears and others to display the proper colors. This is also what the code used to do before the referenced commit. Fixes: `c703658b39` (mesa: Drop _EnabledUnits.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `357d8261f1`)	2016-09-01 11:39:46 +01:00
Brian Paul	7c583adfb5	mesa: fix format conversion bug in get_tex_rgba_uncompressed() We need to set the need_convert flag with each loop iteration, not just when the rgba pointer is null. Bug reported by Markus Müller <mueller@imfusion.de> on mesa-users list. Fixes new piglit arb_texture_float-get-tex3d test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `b9b88516f8`)	2016-09-01 11:39:46 +01:00
Ilia Mirkin	f70585e56a	main: add missing EXTRA_END in OES_sample_variables get check Fixes: `3002296cb6` (mesa: add GL_OES_shader_multisample_interpolation support) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `05b37e20de`)	2016-09-01 11:39:46 +01:00
Jason Ekstrand	2d48468e58	isl: Allow multisampled array textures This probably isn't the only thing that needs to be done to get multisampled array textures working in Vulkan but I think this is all that ISL really needs and it does fix 8 of the new CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> (cherry picked from commit `fb89551047`)	2016-09-01 11:39:46 +01:00
Ian Romanick	ab0183172f	glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10 All the GLSL 4.x keywords were added to the list of reserved keywords in GLSL ES 3.10. As far as I can tell, these are the only ones that were missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `c879dbc4e4`)	2016-09-01 11:39:46 +01:00
Miklós Máté	1d4c887020	vbo: set draw_id Fixes conditional jump depending on uninitialized value in si_state_draw.c:593 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `b9ac72b511`)	2016-09-01 11:39:46 +01:00
Chad Versace	a0e81225bd	i965: Respect miptree offsets in intel_readpixels_tiled_memcpy() Respect intel_miptree_slice::x_offset,y_offset and intel_mipmap_tree::offset. All three may be non-zero when glReadPixels is called on an EGLImage created from the non-base slice of a miptree. Patch 2/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Change-Id: I4b397b27e55a743a7094d29fb0a6a4b6b34352b0 (cherry picked from commit `5b03975889`)	2016-09-01 11:39:46 +01:00
Chad Versace	6898eb5859	i965: Fix miptree layout for EGLImage-based renderbuffers When glEGLImageTargetRenderbufferStorageOES() was given an EGLImage created from the non-base slice of a miptree, intel_image_target_renderbuffer_storage() forgot to apply the intra-tile offsets __DRIimage::tile_x,tile_y to the miptree layout. This patch fixes the problem with a quick hack suitable for cherry-picking. A proper fix requires more thorough plumbing in intel_miptree_create_layout() and brw_tex_layout(). Patch 1/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Change-Id: I8a64b0048a1ee9e714ebb3f33fffd8334036450b (cherry picked from commit `c82f99e883`)	2016-09-01 11:39:46 +01:00
Matt Turner	9aa4e400d2	nir: Walk blocks in source code order in lower_vars_to_ssa. Prior to this commit rename_variables_block() is recursively called, performing a depth-first traversal of the control flow graph. The function uses a non-trivial amount of stack space for local variables, which puts us in danger of smashing the stack, given a sufficiently deep dominance tree. XCOM: Enemy Within contains a shader with such a dominance tree (1574 nir_blocks in total, depth of at least 143). Jason tells me that he believes that any walk over the nir_blocks that respects dominance is sufficient (a DFS might have been necessary prior to the introduction of nir_phi_builder). In fact, the introduction of nir_phi_builder made the problem worse: rename_variables_block() walks to the bottom of the dominance tree before calling nir_phi_builder_value_get_block_def() which walks back to the top of the dominance tree... In any case, this patch ensures we avoid that problem as well. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (cherry picked from commit `e53130cc27`)	2016-09-01 11:39:46 +01:00
Brian Paul	b061b2e3eb	swrast: fix incorrectly positioned putImage() in swrast driver Some front buffer rendering was in the wrong position. This included scissored clears, glDrawPixels and glCopyPixels. The problem was the y coordinate passed to putImage() didn't match the y coordinate passed to getImage(). We fix this by setting xrb->map_y to the inverted coordinate in swrast_map_renderbuffer() which is used later by the putImage() call. Also pass xrb->map_y to getImage() to be symmetric. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `2a2dc416b6`)	2016-09-01 11:39:45 +01:00
Marek Olšák	69acfb7c94	radeonsi: disable SDMA texture copying on Carrizo Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (cherry picked from commit `3ff0b67e1b`)	2016-09-01 11:39:45 +01:00
Jason Ekstrand	544a92ad49	anv: Include the pipeline layout in the shader hash The pipeline layout affects shader compilation because it is what determines binding table locations as well as whether or not a particular buffer has dynamic offsets. Since this affects the generated shader, it needs to be in the hash. This fixes a bunch of CTS tests now that the CTS is using a pipeline cache. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2301705dee`)	2016-09-01 11:39:45 +01:00
Samuel Pitoiset	9fced1aa53	nvc0: invalidate textures/samplers on GK104+ Like Fermi, textures and samplers are aliased between 3D and compute, especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate these resources when switching between the two pipelines. This fixes a GPU hang with Elemental (and most likely with other UE4 demos). Tested on GK107 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a227b0a4f1`)	2016-09-01 11:39:45 +01:00
Marek Olšák	5ad09f744c	radeonsi: fix VM faults due to NULL internal const buffers on CIK They are harmless, but the interrupts do decrease performance. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039 Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2c13abb491`)	2016-09-01 11:39:45 +01:00
Nicolai Hähnle	6001e37b8e	radeonsi: add si_set_rw_buffer to be used for internal descriptors So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `ba4a2840c7`)	2016-09-01 11:39:45 +01:00
Tomasz Figa	e45e500d4b	gallium/winsys/kms: Look up the GEM handle after importing a prime FD drmPrimeFDToHandle() will return the same GEM handle every time the same buffer is imported, even from a different prime FD. Since GEM handles are not reference counted, we need to make sure that each GEM handle is referenced only by one display target struct, by looking it up in kms_sw->bo_list first and bumping the refcount of the found dt on hit and falling back to creating a new dt only on miss. v2: Split into separate function. Use helper function for lookup. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `577f85e2bb`)	2016-09-01 11:39:45 +01:00
Tomasz Figa	4ec509533e	gallium/winsys/kms: Move display target handle lookup to separate function As a preparation to use the lookup in more than one place, move the code that looks up given KMS/GEM handle to a separate function. This change should not introduce any functional changes. v2: Split into separate patch. Move lookup code into separate function. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0465c72d46`)	2016-09-01 11:39:45 +01:00
Tomasz Figa	731e6575e6	gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2) Currently kms_sw_displaytarget_add_from_prime() allocates the struct and fills in only some of the fields, resulting in a half-baked struct that needs to be further completed by the caller. To make this a bit more consistent, pass width, height and stride to this function and fill in everything there, so that caller can take the returned struct as is. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e71b78ebf9`)	2016-09-01 11:39:45 +01:00
Tomasz Figa	ab157ffd86	gallium/winsys/kms: Fix double refcount when importing from prime FD (v2) Currently the code creates a display target struct with refcount field initialized to 1 and then the caller again increments it, leading to a leaked reference. Let's remove the unnecessary increment. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0aa6a818ef`)	2016-09-01 11:39:45 +01:00
Ilia Mirkin	61a6d84679	nv50/ir: make sure cfg iterator always hits all blocks In some very specially-crafted cases, we could attempt to visit a node that has already been visited, and then run out of bb's to visit, while there were still cross blocks on the list. Make sure that those get moved over in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `092f994a03`)	2016-09-01 11:39:45 +01:00
Eric Anholt	fabc5c2783	vc4: Fix leak of the bo_handles table. (cherry picked from commit `9f95690959`)	2016-09-01 11:39:45 +01:00
Rob Herring	ec68600280	vc4: add hash table look-up for exported dmabufs It is necessary to reuse existing BOs when dmabufs are imported. There are 2 cases that need to be handled. dmabufs can be created/exported and imported by the same process and can be imported multiple times. Copying other drivers, add a hash table to track exported BOs so the BOs get reused. v2: Whitespace fixup (by anholt) Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `9ace2c1355`)	2016-09-01 11:39:45 +01:00
Eric Anholt	838c1cbde4	vc4: Fix a leak of the src[] array of VPM reads in optimization. Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a0671d67de`)	2016-09-01 11:39:44 +01:00
Eric Anholt	965ceef596	vc4: Disable early Z with computed depth. We don't tell the hardware whether we're computing depth, so we need to manage early Z state manually. Fixes piglit early-z. (cherry picked from commit `ce8504d196`)	2016-09-01 11:39:44 +01:00
Eric Anholt	0f097f28eb	vc4: Close our screen's fd on screen close. We're passed in a freshly dup()ed fd on screen create, so we should close it on exit. Debugged by Hugh Cole-Baker. (cherry picked from commit `c65a00eaff`)	2016-09-01 11:39:44 +01:00
Rob Herring	7c583f85e1	vc4: fix vc4_resource_from_handle() stride calculation The expected stride calculation is completely wrong. It should ultimately be multiplying cpp and width rather than dividing. The width also needs to be aligned to the tiling width first before converting to stride bytes. The whole stride check here is possibly pointless. Any buffers which were allocated outside of vc4 may have strides with larger alignment requirements. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `067c5b10b6`)	2016-09-01 11:39:44 +01:00
Matt Turner	cdfd7c7b72	mesa: Use AC_HEADER_MAJOR to include correct header for major(). Gentoo has been smoke testing an upcoming change to glibc. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392 (cherry picked from commit `20553e4a2d`)	2016-09-01 11:39:44 +01:00
Stencel, Joanna	e671f40e79	egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface. Segfault occurs when destroying EGL surface attached to already destroyed Wayland window. The fix is to set to NULL the pointer of surface's native window when wl_egl_destroy_window() is called. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `690ead4a13`)	2016-09-01 11:39:44 +01:00
Jason Ekstrand	42fe245370	anv/clear: Clear E5B9G9R9 images as R32_UINT We can't actually clear these images normally because we can't render to them. Instead, we have to manually unpack the rgb9e5 color value on the CPU and clear it as R32_UINT. We still have a bit of work to do to clear non-power-of-two images, but this should get all of the power-of-two clears working on at least Haswell. This fixes three of the new Vulkan CTS tests in the dEQP-VK.api.image_clearing.clear_color_image.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7bdccd104b`) [Emil Velikov: rgb9e5 header is renamed in master s/format_rgb9e5.h/u_format_rgb9e5.h/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 11:39:44 +01:00
Jason Ekstrand	015b920fb4	anv/clear: Make cmd_clear_image take an actual VkClearValue Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `afa7ca0f77`)	2016-09-01 11:39:44 +01:00
Jason Ekstrand	32a0038a02	anv/blit2d: Add support for RGB destinations This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cf3cf2ecfc`)	2016-09-01 11:39:44 +01:00
Jason Ekstrand	346f9e5a85	anv/blit2d: Add a format parameter to bind_dst and create_iview Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `16ddda8452`) [Emil Velikov: don't attribute if using ISL_TILING_W. patches that attribute and require the ISL_TILING_W handling aren't in 12.0] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/intel/vulkan/anv_meta_blit2d.c	2016-09-01 11:39:44 +01:00
Dave Airlie	7fe8cad4e4	st/glsl_to_tgsi: fix st_src_reg_for_double constant. This needs to set the src swizzle so it doesn't access the .zw members ever when we are just emitting a 0 constant here. This fixes: vert-conversion-explicit-dvec3-bvec3.shader_test and a bunch of other fp64 tests on softpipe and radeonsi. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `26187f3890`)	2016-09-01 11:39:44 +01:00
Daniel Scharrer	6847a37363	mesa: Fix fixed function spot lighting on newer hardware (again) This was first fixed in commit `b3f9c5c` and then broken again in commit `fe2d2c7`, which removed the abs modifier from input registers. v2: Don't change the size of struct ureg. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Daniel Scharrer <daniel@constexpr.org> (cherry picked from commit `16ef7ab5c1`)	2016-09-01 11:39:44 +01:00
Matt Turner	bf61f2e5c3	i965/vec4: Ignore swizzle of VGRF for use by var_range_end(). var_range_end(v, n) loops over the n components of variable number v and finds the maximum value, giving the last use of any component of v. Therefore it expects v to correspond to the variable associated with the .x channel of the VGRF. var_from_reg() however returns the variable for the first channel of the VGRF, post-swizzle. So, if the last register had a swizzle with y, z, or w in the swizzle component, we would read out of bounds. For any other register, we would read liveness information from the next register. The fix is to convert the src_reg to a dst_reg in order to call the dst_reg version of var_from_reg() that doesn't consider the swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `e7c376adfd`)	2016-09-01 11:39:43 +01:00
Emil Velikov	d2f0a2925f	cherry-ignore: temporarily(?) drop "a4xx: make sure to actually clamp depth" The commit depends on a 700+ patch introducing fd_batch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 11:39:43 +01:00
Ilia Mirkin	9e71069d8f	a4xx: only disable depth clipping, not all clipping, when requested The previous bit disables the whole clipper, including the regular viewport-related clipping that would go on. The two new bits disable near and far clipping (separately, as verified with the depth-clamp-range piglit). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `cd8e30452f`)	2016-09-01 11:39:43 +01:00
Ilia Mirkin	8d1029fb7b	vbo: add basevertex when looking up elements for vbo splitting Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97351 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `659dc10d32`)	2016-09-01 11:39:43 +01:00
Emil Velikov	16751e0be4	isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d61d259518`) [Emil Velikov: drop not applicable gen4-6 hunks] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/intel/isl/Makefile.am	2016-09-01 11:39:43 +01:00
Emil Velikov	52f094859a	anv: remove dummy VK_DEBUG_MARKER_EXT entry points The vkCmdDbgMarker{Begin,End} symbols are exported, yet the json does not advertise that the driver supports the extension. Furthermore the functions are empty stubs. Remove those until we get a proper implementation and json notation. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ebd5dc8826`)	2016-09-01 11:39:43 +01:00
Emil Velikov	21411adc27	anv: do not export the Vulkan API With version 1 of the Loader interface there is an internal/private symbol (vk_icdGetInstanceProcAddr) which is used to retrieve all the API from the Vulkan entrypoints from the ICD. Implying that exposing the Vulkan API is not recommended. Version 2 goes a step further, explicitly forbidding the ICD from exposing Vulkan symbols (and adding a negotiation API). As a reference: - Nvidia 367.35 Missing negotiation API - version 1. Exposes only vk_icdGetInstanceProcAddr. - AMD 16.30.3.306809 Have negotiation API - version 2. Exposes vk_icdGetInstanceProcAddr. Exposes a couple of Vulkan entry points - seems to be in violation with the spec. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `49394e8d77`)	2016-09-01 11:39:43 +01:00
Emil Velikov	e609ec2c1a	anv: automake: build with -Bsymbolic Explicitly suggested in the Loader interface version 2 section, but it's a good idea either way. It essentially ensures that our symbols are not interposed. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `1cdb6ca40b`)	2016-09-01 11:39:43 +01:00
Emil Velikov	610857acea	anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility Hide the internal symbols and annotate the vk_icdGetInstanceProcAddr as public since the loader needs it (since v1 of the loader interface). v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `40e4fff563`)	2016-09-01 11:39:43 +01:00
Emil Velikov	de1f9ce703	anv: remove internal 'validate' layer Presently the layer has only a single entry point. As mentioned by Jason the function does not validate anything that isn't checked elsewhere, thus we can drop the whole thing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b0d56f2f4f`)	2016-09-01 11:39:43 +01:00
Kenneth Graunke	951b508e44	i965: Fix barrier count shift in scalar TCS backend. The "Barrier Count" field goes in 14:9 of m0.2. The vec4 backend correctly shifts by 9, but the scalar backend only shifted by 8. It's not like this changed - I think I just made a typo when writing the original scalar TCS backend code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `d14dd727f4`)	2016-09-01 11:39:43 +01:00
Kenneth Graunke	47b72990fe	i965: Fix execution size of scalar TCS barrier setup code. Previously, the scalar TCS backend was generating: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q }; shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q }; or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal region. Not to mention it clobbers 8 channels of data when we only wanted to touch m0.2. Instead, we want: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all }; shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all }; or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; Using component() accomplishes this. Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers. barrier_guarded_read_write_calls on Skylake. Probably fixes other barrier issues on Gen8+. v2: Use a group(1, 0) builder so inst->exec_size is set correctly (thanks to Francisco Jerez for catching that it was incorrect). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit `159f037755`)	2016-09-01 11:39:42 +01:00
Kenneth Graunke	9cf5eb292b	i965: Implement the WaPreventHSTessLevelsInterference workaround. Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases: - vertex_spacing - tessellation_shader_point_mode.points_verification - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `9e778837ff`) [Emil Velikov: attribute for the lack of gl_linked_shader struct.] [Namely: s/tes->info./shader_prog->/;s/gl_linked_shader/gl_shader/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/drivers/dri/i965/brw_tcs.c	2016-09-01 11:39:42 +01:00
Kenneth Graunke	6ea7a82b5e	nir/builder: Add bany_inequal and bany helpers. The first simply picks the bany_inequal[234] opcodes based on the SSA def's number of components. The latter implicitly compares with zero to achieve the same semantics of GLSL's any(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `d8971128ac`)	2016-09-01 11:39:42 +01:00
Kenneth Graunke	ab441496ca	mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case. GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers: 0.0, if E == 0 and M == 0, 2^-14 * (M / 32), if E == 0 and M != 0, 2^(E-15) * (1 + M/32), if 0 < E < 31, INF, if E == 31 and M == 0, or NaN, if E == 31 and M != 0, In the second case (E == 0 and M != 0), we were multiplying the mantissa by 2^-20, when we should have been multiplying by 2^-19 (which is 2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32). The previous section defines the formula for 11-bit numbers, which is: 2^-14 * (M / 64), if E == 0 and M != 0, In other words, we had accidentally copy and pasted the 11-bit code to the 10-bit case, and neglected to change the exponent. Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles when run with surface dimensions of 1536x1152 or 1920x1080. Cc: mesa-stable@lists.freedesktop.org References: https://code.google.com/p/chrome-os-partner/issues/detail?id=56244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Reviewed-by: Antia Puentes <apuentes@igalia.com> (cherry picked from commit `01e99cba04`)	2016-09-01 11:39:42 +01:00
Michel Dänzer	d1142926c2	glx: Don't use current context in __glXSendError There's no guarantee that there is one, and we don't need one anyway. Fixes piglit tests: glx@glx-fbconfig-bad glx@glx_ext_import_context@import context, multi process glx@glx_ext_import_context@import context, single process Fixes: `2e3f067458` ("glx: fix error code when there is no context bound") Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit `4ac640e3d2`)	2016-09-01 11:39:42 +01:00
Ilia Mirkin	8b76a3744c	nv50/ir: fix bb positions after exit instructions It's fairly rare that the BB layout puts BBs after the exit block, which is likely the reason these issues lingered for so long. This fixes a fraction of issues with the giant pixmark piano shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e988999791`)	2016-09-01 11:39:42 +01:00
Dave Airlie	d279dec359	anv: fix writemask on blit fragment shader. I'm not sure if anything even uses this, but I found this on radv, so just fix it on anv for consistency. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `c2f2252037`)	2016-09-01 11:39:42 +01:00
Bernard Kilarski	c04ee8c110	glx: fix error code when there is no context bound v2: change all related NULL checks to check against dummyContext v3: really check for dummyContext only when ctx was from __glXGetCurrentContext v4: cover more checks, add dummyBuffer, dummyVtable (Emil) Signed-off-by: Bernard Kilarski <bernard.r.kilarski@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2e3f067458`)	2016-09-01 11:39:42 +01:00
Ilia Mirkin	07df4bf0c8	nv50,nvc0: fix depth range when halfz is enabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5c1ccd8053`)	2016-09-01 11:39:42 +01:00
Ilia Mirkin	32c009b116	gallium/util: add helper to compute zmin/zmax for a viewport state Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c85b7f0e87`)	2016-09-01 11:39:42 +01:00
Ilia Mirkin	ea29312a79	vbo: allow DrawElementsBaseVertex in display lists Looks like it was missed originally. The multi version is there already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97331 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `68b64f32e8`)	2016-09-01 11:39:42 +01:00
Kenneth Graunke	bc40bc5527	glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00. Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant" to be specified on both inputs and outputs, and match when linking. New languages only allow outputs to be qualified as "invariant" and remove the "invariant must match" restriction when linking varyings (because no input can have that qualifier). Commit `426a50e208` introduced the new behavior for ES 3.00. It also removed the "must match" restriction for ES 1.00 shaders, which I believe is incorrect. This patch adds that back, as well as making 4.30+ follow the new rules. Thanks to Qiankun Miao for noticing this discrepancy. Fixes a WebGL 2.0 conformance test when run in Chromium: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2 Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `f9f462936a`)	2016-09-01 11:39:42 +01:00
Ian Romanick	0c25b3e4b0	glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says: It is an error to undefine or to redefine a built-in (pre-defined) macro name. The GLSL ES 1.00 spec does not contain this text. Section 3.3 (Preprocessor) of the GLSL 1.30 spec says: #define and #undef functionality are defined as is standard for C++ preprocessors for macro definitions both with and without macro parameters. At least as far as I can tell GCC allows '#undef __FILE__'. Furthermore, there are desktop OpenGL conformance tests that expect '#undef __VERSION__' and '#undef GL_core_profile' to work. Fixes: GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `50b49d242d`) Squashed with commit glcpp: Update tests for new #undef of built-in macro rules. Ian recently changed the preprocessor to allow this in most GLSL versions, but not GLSL ES 3.00+. This patch converts the existing test that expects a failure to a #version 300 es shader, and adds a #version 110 shader to make sure that it's allowed. Fixes 'make check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org> (cherry picked from commit `1f47f78fc3`)	2016-09-01 11:39:29 +01:00
Ian Romanick	e1698fa455	glcpp: Track the actual version instead of just the version_resolved flag Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `eda6349346`) [Emil Velikov: resolve trivial conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/compiler/glsl/glcpp/glcpp-parse.y	2016-09-01 10:06:24 +01:00
Jason Ekstrand	3dca5c8eb1	i965/vec4: Make opt_vector_float reset at the top of each block The pass isn't really control-flow aware and you can get into a case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4c3a6b07e2`)	2016-09-01 10:06:24 +01:00
Marek Olšák	659d9f189c	radeonsi: only set dual source blending for MRT0 This is the proper fix for Overlord and Witcher 2 hangs. The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel shader while MRT1 is disabled in CB_TARGET_MASK (does this generate unflushable pixel quads? I don't know), and another app (e.g. Glamor) must enable dual source blending in both MRT0 and MRT1. The hw gets confused, which leads to corruption and hangs. Cc: 12.0 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (cherry picked from commit `947e0614d0`)	2016-09-01 10:06:23 +01:00
Nicolai Hähnle	1959b57310	radeonsi: flush TC L2 cache for indirect draw data This fixes a bug when indirect draw data is generated by transform feedback. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `2852dedaa0`)	2016-09-01 10:06:23 +01:00
Kenneth Graunke	fce2e3b493	glsl: Fix location bias for patch variables. We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0. Since "patch" only applies to inputs and outputs, we can just handle this once outside the switch statement, rather than replicating the check twice and complicating the earlier conditions. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `398428f406`)	2016-09-01 10:06:23 +01:00
Kenneth Graunke	236172997c	glsl: Fix the program resource names of gl_TessLevelOuter/Inner[]. These are lowered to gl_TessLevel{Outer,Inner}MESA. We need them to appear in the program resource list with their original names and types. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `1556f16e46`)	2016-09-01 10:06:23 +01:00
Kenneth Graunke	cd009c46be	glsl: Delete bogus ir_set_program_inouts assert. This assertion is bogus. Varying structs, and arrays of structs, are allowed by GLSL, and we can see them here. While we currently don't have any partial-variable support for those, simply returning false and marking the entire thing as used is certainly legitimate. I believe this is often swept under the rug by varying packing, but that's disabled in certain tessellation situations. Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `4a49851da1`)	2016-09-01 10:06:23 +01:00
Nanley Chery	f20168723f	anv/gen7_pipeline: Set PixelShaderKillPixel for discards According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader contains a discard instruction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `c495c18b24`)	2016-09-01 10:06:23 +01:00
Jan Ziak	6980f686d0	loader: fix memory leak in loader_dri3_open Found via "valgrind --leak-check=full glxgears". Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Acked-by: Boyan Ding <boyan.j.ding@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `fd32868590`)	2016-09-01 10:06:23 +01:00
Marek Olšák	9e22182223	gallium/util: fix align64 it cut off the upper 32 bits Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (cherry picked from commit `6db93cd167`)	2016-09-01 10:06:23 +01:00
Nicolas Boichat	e82567c02b	egl/dri2: Add reference count for dri2_egl_display android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls eglTerminate, followed by eglReleaseThread. A similar case is observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622, where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL). With the current code, dri2_dpy structure is freed on eglTerminate call, so the display is not initialized when eglReleaseThread calls MakeCurrent with NULL parameters, to unbind the context, which causes a segfault in drv->API.MakeCurrent (dri2_make_current), either in glFlush or in a later call. eglTerminate specifies that "If contexts or surfaces associated with display is current to any thread, they are not released until they are no longer current as a result of eglMakeCurrent." However, to properly free the current context/surface (i.e., call glFlush, unbindContext, driDestroyContext), we still need the display vtbl (and possibly an active dri dpy connection). Therefore, we add some reference counter to dri2_egl_display, to make sure the structure is kept allocated as long as it is required. One drawback of this is that eglInitialize may not completely reinitialize the display (if eglTerminate was called with a current context), however, this seems to meet the EGL spec quite well, and does not permanently leak any context/display even for incorrectly written apps. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `9ee683f877`) Squashed with commit egl/dri2: dri2_make_current: Release previous context's display eglMakeCurrent can also be used to change the active display. In that case, we need to decrement ref_count of the previous display (possibly destroying it), and increment it on the next display. 
Also, old_dsurf/old_rsurf cannot be non-NULL if old_ctx is NULL, so we only need to test if old_ctx is non-NULL. v2: Save the old display before destroying the context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97214 Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Tested-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> (cherry picked from commit `78e3cea419`) Squashed with commit egl/dri2: dri2_initialize: Do not reference-count TestOnly display In the case where dri2_initialize is called with a TestOnly display, the display is not actually initialized, so dri2_egl_display always fails, and we cannot do any reference counting. Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible with LIBGL_ALWAYS_SOFTWARE=1). Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `4f3f8bb59d`)	2016-09-01 10:05:40 +01:00
Nicolas Boichat	6503a05f35	egl/android: Set dpy->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c0580f6a38`)	2016-09-01 10:05:40 +01:00
Nicolas Boichat	447501914a	egl/drm: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `a9e8fb7397`)	2016-09-01 10:05:40 +01:00
Nicolas Boichat	f8a5d340e8	egl/surfaceless: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0e67d86540`)	2016-09-01 10:05:39 +01:00
Nicolas Boichat	6ec0c92b3c	egl/wayland: Set disp->DriverData to NULL on error Avoid use-after-free, fix spec@egl_khr_fence_sync@conformance. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `48fd952f28`)	2016-09-01 10:05:39 +01:00
Jan Ziak	fbde508c18	egl/x11: avoid using freed memory if dri2 init fails Found with valgrind: ==4841== Invalid read of size 4 ==4841== at 0x56BDC80: dri2_initialize (egl_dri2.c:783) ==4841== by 0x56BAFE5: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB15E: _eglMatchDriver (egldriver.c:295) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Address 0x6a05824 is 148 bytes inside a block of size 480 free'd ==4841== at 0x4C2B680: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2AAE: dri2_initialize_x11_swrast (platform_x11.c:1233) ==4841== by 0x56C2AAE: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Block was alloc'd at ==4841== at 0x4C2A868: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2A47: dri2_initialize_x11_swrast (platform_x11.c:1171) ==4841== by 0x56C2A47: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in 
/usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `769ac1ec78`)	2016-09-01 10:05:39 +01:00
Nicolai Hähnle	178b889823	glsl: fix optimization of discard nested multiple levels The order of optimizations can lead to the conditional discard optimization being applied twice to the same discard statement. In this case, we must ensure that both conditions are applied. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762 Cc: mesa-stable@lists.freedesktop.org Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `21556d86fc`) [Emil Velikov: s/get_head_raw()/head/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/compiler/glsl/opt_conditional_discard.cpp	2016-07-28 17:05:28 +01:00
Nicolai Hähnle	7208d82dfb	st_glsl_to_tgsi: only skip over slots of an input array that are present When an application declares varying arrays but does not actually do any indirect indexing, some array indices may end up unused in the consuming shader, so the number of input slots that correspond to the array ends up less than the array_size. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `185b0c15ab`)	2016-07-28 14:51:32 +01:00
Jason Ekstrand	be0344f630	i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5d76690f17`)	2016-07-28 14:50:32 +01:00
Nicolai Hähnle	6156d8d93e	radeonsi: ensure sample locations are set for line and polygon smoothing Since commit `d938b8c`, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `3d69357da9`)	2016-07-28 14:49:12 +01:00
Nicolai Hähnle	3237c07e98	radeonsi: fix Polaris MSAA regression The regression was introduced by commit `d938b8c`. The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values. Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908 Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f755da0f2f`)	2016-07-28 14:47:49 +01:00
Kenneth Graunke	5f09454e34	mesa: Don't call GenerateMipmap if Width or Height == 0. One of the WebGL 2.0 conformance tests is trying to call glGenerateMipmaps with a width and height of 0. With the meta implementation, this generates a "framebuffer attachment incomplete" status, and falls back to the CPU path, calling MapTextureImage. Except that there's no actual texture to map, and we assert fail. There's no work to do in this case. The test expects it to succeed, so just return early with no error and avoid hassling the driver. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `f80bea2d80`)	2016-07-28 14:46:50 +01:00
Jason Ekstrand	1951df7812	anv/pipeline: Set up point coord enables Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b33bccb519`)	2016-07-28 14:45:48 +01:00
Kenneth Graunke	2f8cd4a8c3	mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats. The GL_EXT_texture_format_BGRA8888 extension specification defines a GL_BGRA_EXT unsized internal format (which is a little odd - usually BGRA is a pixel transfer format). The extension is written against the ES 1.0 specification, so it's a little hard to map, but I believe it's effectively adding it to the table used here, so we should allow it here as well. Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true), so we don't need to check if it's enabled here. This fixes mipmap generation in Skia and ChromeOS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Stéphane Marchesin <marcheu@chromium.org> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `cb70773129`)	2016-07-28 14:44:45 +01:00
Kenneth Graunke	2f80fb368b	i965: Fix shared atomic intrinsics to pay attention to base. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `76e161056a`)	2016-07-28 14:43:35 +01:00
Kenneth Graunke	e1c20919a8	nir: Add a base const_index to shared atomic intrinsics. Commit `52e75dcb8c` made nir_lower_io start using nir_intrinsic_set_base instead of writing const_index[0] directly. However, those intrinsics apparently don't /have/ a base, so this caused assert failures. However, the old code was happily setting non-existent const_index fields, so it was pretty bogus too. Jason pointed out that load_shared and store_shared have a base, and that the i965 driver uses that field. So presumably atomics should have one as well, so that loads/stores/atomics all refer to variables with consistent addressing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `cf6f2d3ce7`)	2016-07-28 14:40:24 +01:00
Kenneth Graunke	a94056e2c7	i965: Include VUE handles for GS with invocations > 1. We always resort to the pull model for instanced GS inputs. So, we'd better include the VUE handles, or else we can't actually pull anything. Ian reports that on his branch with OES_geometry_shader enabled, this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:: - instanced.draw_2_instances_geometry_2_invocations - instanced.draw_2_instances_geometry_8_invocations - instanced.draw_4_instances_geometry_2_invocations - instanced.draw_4_instances_geometry_8_invocations - instanced.draw_8_instances_geometry_2_invocations - instanced.draw_8_instances_geometry_8_invocations - instanced.geometry_2_invocations - instanced.geometry_32_invocations - instanced.geometry_8_invocations - instanced.geometry_max_invocations - instanced.geometry_output_different_2_invocations - instanced.geometry_output_different_32_invocations - instanced.geometry_output_different_8_invocations - instanced.geometry_output_different_max_invocations - instanced.invocation_output_vary_by_attribute - instanced.invocation_output_vary_by_texture - instanced.invocation_output_vary_by_uniform - query.primitives_generated_instanced Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `2db357e4c3`)	2016-07-28 14:21:29 +01:00
Chuck Atkins	d70f97784b	swr: Refactor checks for compiler feature flags Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines such as __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t their availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@Intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `c1bf6692be`)	2016-07-27 15:15:34 +01:00
Tim Rowley	cd10b86026	swr: switch from overriding -march to selecting features Acked-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `5a64549f54`)	2016-07-27 15:14:57 +01:00
Marek Olšák	3b4c74963a	winsys/amdgpu: disallow DCC with mipmaps It has never been implemented. master will get a different fix. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96381 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-27 14:37:01 +01:00
Samuel Pitoiset	faa432c0b6	nvc0: upload sample locations on GM20x This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `e7b2ce5fd8`) [Emil Velikov: resolve conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c	2016-07-27 14:21:10 +01:00
Rob Herring	271eee2464	Android: add missing u_math.h include path for libmesa_isl Commit `87d062a940` ("i965: Fix shared local memory size for Gen9+.") added u_math.h include which broke the Android build: In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25: In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29: external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found ^ Add the missing include paths for libmesa_isl. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Kenneth Garunke <kenneth@whitecape.org> Nominated-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `789ed13284`)	2016-07-27 14:06:27 +01:00
Matt Turner	fd2312e745	mapi: Massage code to allow clang to compile. According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code was violating the spec, resulting in it failing to compile. Cc: mesa-stable@lists.freedesktop.org Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5ec140c17b`) Squashed with commit: mapi: fix typo in macro name Fixes: `5ec140c17b` ("mapi: Massage code to allow clang to compile.") Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> (cherry picked from commit `4da9f7e7ce`)	2016-07-27 11:07:53 +01:00
Jason Ekstrand	67032af87a	nir/inline: Constant-initialize local variables in the callee if needed Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9d503aea06`)	2016-07-21 16:08:59 +01:00
Jason Ekstrand	f9964dd2c6	nir: Add a nir_deref_foreach_leaf helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `dc9f2436c3`)	2016-07-21 16:06:08 +01:00
Kenneth Graunke	1c9412bb1a	anv: Properly call gen75_emit_state_base_address on Haswell. This should fix MOCS values. Caught by Coverity. CID: 1364155 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `e614062e54`)	2016-07-21 16:05:12 +01:00
Kenneth Graunke	726f47e495	genxml: Rename "API Rendering Disable" to "Rendering Disable". Gen7/7.5 call it "Rendering Disable" while Gen8/9 prefix it with "API". Pick one for consistency, and so we can share code between generations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `87660579f5`)	2016-07-21 16:04:04 +01:00
Kenneth Graunke	1dd0c22ab0	anv: Unify 3DSTATE_CLIP code across generations. The bulk of this is the same. There are just a couple fields that only exist on one generation or another, and we can easily handle those with an #ifdef. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `bfd9942cdc`)	2016-07-21 16:03:03 +01:00
Kenneth Graunke	a00081a9e8	anv: Enable early culling on Gen7. We set the cull mode, but forgot the enable bit. Gen8 uses this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `44502afd82`)	2016-07-21 16:02:06 +01:00
Kenneth Graunke	626f21051b	anv: Fix near plane clipping on Gen7/7.5. The Gen7/7.5 clip code used APIMODE_OGL, while the Gen8+ clip code used APIMODE_D3D. The meaning hasn't changed, so one of these must be wrong. It appears that the hardware documentation is completely wrong. It claims that the "API Mode" bit means: 0h APIMODE_OGL NEAR_VP boundary == 0.0 (NDC) 1h APIMODE_D3D NEAR_VP boundary == -1.0 (NDC) However, DirectX typically uses 0.0 for the near plane, while unextended OpenGL uses -1.0. i965's gen6_clip_state.c uses APIMODE_D3D for the GL_ZERO_TO_ONE case, so I believe the meanings are backwards from what the documentation says. Section 23.2 ("Primitive Clipping") of the Vulkan 1.0.21 specification contains the following equations: -w_c <= x_c <= w_c -w_c <= y_c <= w_c 0 <= z_c <= w_c This means that Vulkan follows D3D semantics. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `0d77f08042`)	2016-07-21 16:01:05 +01:00
Kenneth Graunke	629a7b32e0	genxml: Add APIMODE_D3D missing enum values and improve consistency. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `6b67270262`)	2016-07-21 15:59:45 +01:00
Kenneth Graunke	749e4cb96b	genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values. Gen6-7.5 use CLIPMODE_REJECT_ALL, while Gen8+ just used REJECT_ALL. Being consistent will let me unify code, and I prefer having the prefix. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `c31cf532af`)	2016-07-21 15:58:37 +01:00
Jason Ekstrand	48e9ecc47f	i965/miptree: Set logical_depth0 == 6 for cube maps This matches what we do for cube maps where logical_depth0 is in number of face-layers rather than number of cubes. This does mean that we will temporarily be setting the surface bounds too loose for cube map textures but we are already setting them too loose for cube arrays and we will be fixing that in the next commit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e19b7f7f1b`)	2016-07-21 15:57:39 +01:00
Jason Ekstrand	57c1d0ea07	i965/miptree: Enforce that height == 1 for 1-D array textures The GL API and mesa internals do this differently than we do. In GL, there is no depth parameter for 1-D arrays and height is used. In the i965 miptree code we do the sane thing and make height == 1 and use depth for number of slices. This makes for a mismatch every time we create a 1-D array texture from GL. Instead of actually solving this problem, we just said "1-D is hard, let's make sure it works no matter which way we pass the parameters" and called it a day. This commit fixes the one GL -> i965 transition point where we weren't already handling 1-D array textures to do the right thing and then replaces the magic fixup code with an assert that you're doing the right thing. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d4d505d0b0`)	2016-07-21 14:32:33 +01:00
Stefan Dirsch	fb8b548ac1	Avoid overflow in 'last' variable of FindGLXFunction(...) This 'last' variable used in FindGLXFunction(...) may become negative, but has been defined as unsigned int resulting in an overflow, finally resulting in a segfault when accessing _glXDispatchTableStrings[...]. Fixed this by defining it as signed int. 'first' variable also needs to be defined as signed int. Otherwise condition for while loop fails due to C implicitly converting signed to unsigned values before comparison. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `27ef7bfd6c`)	2016-07-21 14:31:27 +01:00
Tomasz Figa	92799eee93	egl/android: Stop leaking DRI images Current implementation of the DRI image loader does not free the images created in get_back_bo() and so leaks memory. Moreover, it creates a new image every time the DRI driver queries for buffers, even if the backing native buffer has not changed, leaking memory again. This patch adds missing call to destroyImage() in droid_enqueue_buffer() and a check if image is already created to get_back_bo() to fix the above. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `9e1248d075`)	2016-07-21 14:29:40 +01:00
Tomasz Figa	b717b49286	egl/android: Check return value of dri2_get_dri_config() It might return NULL if specific config variant is unsupported. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `94282b6dd0`)	2016-07-21 14:29:40 +01:00
Emil Velikov	bb0c49bf76	i965: store reference to the context within struct brw_fence (v2) As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Tomasz Figa <tfiga@chromium.org> (cherry picked from commit `4f48674d51`)	2016-07-21 14:29:40 +01:00
Nicolas Boichat	de695014eb	egl/dri2: dri2_make_current: Set EGL error if bindContext fails Without this, if a configuration is, say, available only on GLES2/3, but not on GLES1, and is rejected by the dri module's bindContext call, eglMakeCurrent fails with error "EGL_SUCCESS". In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP dEQP-EGL.functional.surfaceless_context expect. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `9bebef4034`)	2016-07-21 14:29:40 +01:00
Tomasz Figa	67e04622d8	egl/android: Remove unused variables There are some unused variables left after previous clean-ups triggering compiler warnings. Let's remove them. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ccda100a5a`)	2016-07-21 14:29:40 +01:00
Haixia Shi	fde7fc1ab2	platform_android: prevent deadlock in droid_swap_buffers To avoid blocking other EGL calls, release the display mutex before we enqueue buffer to android frameworks and re-acquire the mutex upon return. v2: moved lock/unlock inside droid_window_enqueue_buffer(). TEST=verify pinch zoom in Photos app no longer causes hangs Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `1ea233c6f3`)	2016-07-21 14:29:40 +01:00
Tomasz Figa	abaf0e9817	gallium/dri: Add shared glapi to LIBADD on Android An earlier patch fixed the problem for classic drivers, however Gallium was still left broken. This patch applies the same workaround to Gallium, when compiled for Android. Following is a quote from the original patch: `0cbc90c57c` mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY \| RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW \| RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `70a28afb29`)	2016-07-21 14:01:15 +01:00
Emil Velikov	485c6d231e	mesa: scons: list builddir before srcdir Analogous to previous commit. Note: scons always uses OOT builds, while the in-tree generated files could be created either manually or by the autoconf build. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `1c7c0d77ac`) [Emil Velikov: remove not-applicable "Dir('main')" entry] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/mesa/SConscript	2016-07-21 13:59:25 +01:00
Emil Velikov	1f3ce210cd	mesa: automake: list builddir before srcdir In the case of building in out-of-tree fashion, while having generated in-tree sources, the latter [likely stale] files will be used. Flip the order to prevent that. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `eafa82e20e`)	2016-07-21 13:53:16 +01:00
Samuel Pitoiset	f123f574fa	gm107/ir: make use of ADD32I for all immediates ADD only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9c63224540`)	2016-07-21 13:51:55 +01:00
Samuel Pitoiset	bbb0587c78	gm107/ir: add missing NEG modifier for IADD32I Like FADD32I, the NEG modifier of src0 is at position 56. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `0904a2ba97`)	2016-07-21 13:50:33 +01:00
Andreas Boll	fb4ab871a8	configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too The help string wasn't updated in `cbc37f7`. Fixes: `cbc37f7` ("anv: install the intel_icd.json to ${datarootdir} by default") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `d66cb7c84f`)	2016-07-21 13:49:34 +01:00
Ilia Mirkin	53bb4e0354	nv50,nvc0: srgb rendering is only available for rgba/bgra Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already didn't have the bind flags). This makes the state tracker pick a different format when rendering is required, or mark the fb as incomplete. This fixes: bin/getteximage-formats init-by-clear-and-render -auto -fbo bin/getteximage-formats init-by-rendering -auto -fbo which previously ran into srgb-encoding differences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `ed9dd3bcd9`)	2016-07-21 13:48:16 +01:00
Leo Liu	aeb3ca9754	vl/dri3: fix a memory leak from front buffer Inspired by fix for mem leak of vdpau interop, resource_from_handle set texture reference count, that need to be decreased and released, recall there is a similar case for DRI3, that is with VA-API glx extension, there is temporary TFP(texture from pixmap), we target it through dma-buf. leak happens when without count down the reference. Checked and found with mpv vo=opengl case, there only one static TFP, the leak happens once, but for totem player using gstreamer VA-API glx, the dynamic TFP for each frame, so leak quite a bit. This fixes mem leak for mpv and totem. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `134d6e4e4f`)	2016-07-21 13:46:55 +01:00
Jason Ekstrand	b1b601fc7c	anv: Handle VK_WHOLE_SIZE properly for buffer views The old calculation, which used view->offset, incorporated buffer->offset into the size calculation where it doesn't belong. This meant that, if buffer->offset > buffer->size, you would always get a negative size. This fixes 170 dEQP-VK.renderpass.attachment.* Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `593731ea3c`) [Emil Velikov: s\|bpb / 8\|bs\|g] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/intel/vulkan/anv_image.c	2016-07-21 13:46:02 +01:00
Jason Ekstrand	7441632753	anv: Add an align_down_npot_u32 helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `827405f072`)	2016-07-21 12:31:28 +01:00
Jason Ekstrand	56d6f64206	anv: Enable independentBlend on gen7 We can totally do it, we were just only setting up one BLEND_STATE and, now that the code is unified with gen8, we should be handling it correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f124f4a394`)	2016-07-21 12:30:32 +01:00
Jason Ekstrand	f194c84b37	anv/pipeline: Unify blend state setup between gen7 and gen8 This fixes all 674 broken dEQP-VK.pipeline.blend Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a2e7b2e653`)	2016-07-21 12:29:34 +01:00
Jason Ekstrand	cb9a2a4b85	genxml: Make gen6-7 blending look more like gen8 This renames BLEND_STATE to BLEND_STATE_ENTRY and adds a new struct BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs. This will make it much easier to write gen-agnostic blend handling code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `aaa202ebe7`)	2016-07-21 12:28:20 +01:00
Brian Paul	9798eb14da	mesa: use _mesa_clear_texture_image() in clear_texture_fields() This avoids a failed assert(img->_BaseFormat != -1) in init_teximage_fields_ms() because the internalFormat argument is GL_NONE. This was hit when using glTexStorage() to do a proxy texture test. Fixes a failure with the updated Piglit tex3d-maxsize test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (cherry picked from commit `e477d92c94`)	2016-07-21 12:27:27 +01:00
Nanley Chery	60e41ca10a	isl: Fix isl_tiling_is_any_y() Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `00caba4152`)	2016-07-21 12:25:54 +01:00
Nanley Chery	826117a1c4	anv/device: Fix max buffer range limits Set limits that are consistent with ISL's assertions in isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType mapping in anv_isl_format_for_descriptor_type(). Fixes the following new crucible tests: * stress.limits.buffer-update.range.uniform * stress.limits.buffer-update.range.storage These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/ Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `a5748cb920`)	2016-07-21 12:24:55 +01:00
Nanley Chery	f334b6deaa	isl: Fix assert on raw buffer surface state size See inline PRM reference. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `028f6d8317`)	2016-07-21 12:23:58 +01:00
Nanley Chery	30c6bff143	anv/descriptor_set: Fix binding partly undefined descriptor sets Section 13.2.3. of the Vulkan spec requires that implementations be able to bind sparsely-defined Descriptor Sets without any errors or exceptions. When binding a descriptor set that contains a dynamic buffer binding/descriptor, the driver attempts to dereference the descriptor's buffer_view field if it is non-NULL. It currently segfaults on undefined descriptors as this field is never zero-initialized. Zero undefined descriptors to avoid segfaulting. This solution was suggested by Jason Ekstrand. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `fd16e64321`)	2016-07-21 12:23:00 +01:00
Brian Paul	474b169c1f	svga: handle mismatched number of samplers, sampler views in svga_init_shader_key_common(). Since the CSO module only tracks sampler views for fragment shaders, the number of samplers and sampler views can be mismatched for other types of shaders. This situation triggered an assertion in Chrome with maps.google.com This patch adds defensive code to handle that situation. Fixes VMware bug 1694027 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com> (cherry picked from commit `50a669de4e`)	2016-07-21 12:21:57 +01:00
Leo Liu	6deeccf5aa	st/omx/enc: check uninitialized list from task release The uninitialized list should be checked and returned. Thank Julien for the notification and suggested fix. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b9d10e79c8`)	2016-07-21 12:20:56 +01:00
Jason Ekstrand	bf11931c95	glsl/types: Use _mesa_hash_data for hashing function types This is way better than the stupid string approach especially since you could overflow the string. Again, I thought I had something better at one point but it obviously got lost. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b919100d61`)	2016-07-21 12:19:38 +01:00
Jason Ekstrand	17e1b016fc	glsl/types: Fix function type comparison function It was returning true if the function types have different lengths rather than false. This was new with the SPIR-V to NIR pass and I thought I'd fixed it a while ago but it may have gotten lost in rebasing somewhere. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `11ac1c4dbb`)	2016-07-21 12:18:19 +01:00
Christian König	b0d1395480	st/mesa: fix reference counting bug in st_vdpau Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <christian.koenig@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com> Ack-by: Tom St Denis <tom.stdenis@amd.com> (cherry picked from commit `9ce52baf7f`)	2016-07-21 12:17:21 +01:00
Jason Ekstrand	ad09c8142f	anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b9e99282a6`)	2016-07-21 12:16:14 +01:00
Tim Rowley	2e010ab1cc	Revert "gallium: Force blend color to 16-byte alignment" This reverts commit `d8d6091a84`. Heap allocations may be only 8-byte aligned on 32-bit system, and so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior which indeed causes crashes when compiled with gcc -O3. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> Acked-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `29f53d7937`)	2016-07-21 12:08:18 +01:00
Jason Ekstrand	0aae486a8b	nir/spirv: Don't multiply the push constant block size by 4 I have no idea why we were multiplying by 4 before. The offsets we get from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to do any adjustment whatsoever. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `49476576dd`)	2016-07-21 12:07:01 +01:00
Marek Olšák	605063953d	radeonsi: add a workaround for a compute VGPR-usage LLVM bug v2: use abort(), describe which LLVM version is affected Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `d227dbe272`)	2016-07-21 12:05:48 +01:00
Marek Olšák	cbdbf67f1c	glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast This bug is uncovered by glsl/lower_if_to_cond_assign. I don't know if it can be reproduced in any other way. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `ead7736821`)	2016-07-21 12:04:46 +01:00
Ilia Mirkin	a2aec66444	mesa: set _NEW_BUFFERS when updating texture bound to current buffers When a glTexImage call updates the parameters of a currently bound framebuffer, we might miss out on revalidating whether it is complete. Make sure to set _NEW_BUFFERS which will trigger the revalidation in that case. Also while we're at it, fix the fb parameter passed in to the eventual RenderTexture call. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94148 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr> (cherry picked from commit `da7223ebdc`)	2016-07-21 12:03:44 +01:00
Ilia Mirkin	1ed8237b1a	st/mesa: return appropriate mesa format for ETC texture formats Even when the backend driver does not support ETC formats, we handle the decoding into an uncompressed backing texture. However as far as core mesa is concerned, it's an ETC texture and we should return the relevant ETC mesa format. This condition can get hit when using glTexStorage to create the texture object. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `00d4315d37`)	2016-07-21 12:02:43 +01:00
Ilia Mirkin	e7de53fefd	mesa: etc2 online compression is unsupported, don't attempt it Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `8ee3cdde04`)	2016-07-21 12:00:55 +01:00
Samuel Pitoiset	76a2950c1e	nvc0: fix the driver cb size when draw parameters are used The size of the driver constant buffer for each stage should be 2048 and not 512 because it has been increased recently for buffers/images. While we are at it, do the same change for indirect draws. This fixes all ARB_shader_draw_parameters tests on GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `31a615677b`)	2016-07-21 11:59:33 +01:00
Samuel Pitoiset	d6c387933d	nvc0/ir: fix images indirect access on Fermi This fixes the following piglits: arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `19d0450b27`)	2016-07-21 11:57:57 +01:00
Nicolai Hähnle	60eabe9ad3	radeonsi: explicitly choose center locations for 1xAA on Polaris Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't. As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `d938b8c0bf`)	2016-07-21 11:56:25 +01:00
Francisco Jerez	04584d5835	i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush. This hardware race condition has caused problems several times already (see "i965: Fix cache pollution race during L3 partitioning set-up.", "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and "i965: intel_texture_barrier reimplemented"). The problem is that whenever we attempt to both flush and invalidate multiple caches with a single pipe control command the flush and invalidation happen in reverse order, so the contents flushed from the R/W caches aren't guaranteed to become visible from the invalidated caches after the PIPE_CONTROL command completes execution if some concurrent rendering workload happened to pollute any of the invalidated R/O caches in the short window of time between the invalidation and flush. This makes sure that brw_emit_pipe_control_flush() has the effect expected by most callers of making the contents flushed from any R/W caches visible from the invalidated R/O caches. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `37b901003b`)	2016-07-21 11:54:41 +01:00
Francisco Jerez	2035ac24ce	i965: Make room in the batch epilogue for three more pipe controls. Review carefully, it sucks to have to keep track of the number of command packet dwords emitted in the batch epilogue manually. The MI_REPORT_PERF_COUNT_BATCH_DWORDS calculation was obviously wrong. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `0bd3a121c6`)	2016-07-21 11:52:56 +01:00
Francisco Jerez	72d287e347	i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush. There were two places in the driver doing a pipe control VF cache flush, one of them was missing this workaround, move it down into brw_emit_pipe_control_flush to make sure we don't miss it again. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `a10879f48c`)	2016-07-21 11:51:53 +01:00
Ian Romanick	3a35da7e8a	glsl: Pack integer and double varyings as flat even if interpolation mode is none v2: Also update varying_matches::compute_packing_class(). Suggested by Timothy Arceri. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `3119871bd9`)	2016-07-21 11:48:58 +01:00
Ian Romanick	bc68532a06	mesa: Strip arrayness from interface block names in some IO validation Outputs from the vertex shader need to be able to match per-vertex-arrayed inputs of later stages. Accomplish this by stripping one level of arrayness from the names and types of outputs going to a per-vertex-arrayed stage. v2: Add missing checks for TESS_EVAL->GEOMETRY. Noticed by Timothy Arceri. v3: Use a slightly simpler stage check suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `73a6a4ce49`)	2016-07-21 11:30:46 +01:00
Emil Velikov	edfc17a19a	docs: add sha256 checksums for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-09 00:02:13 +01:00
Emil Velikov	04277f058d	docs: add release notes for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 23:48:58 +01:00
Emil Velikov	a2770e55a2	Update version to 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 23:48:11 +01:00
Emil Velikov	a705f82a56	radeon: reference the correct cdw/max_dw With commit `f41f78cda1` ("radeonsi: drop the DRAW_PREAMBLE packet on Polaris") we failed to attribute that the separate current/prev radeon_winsys_cs_chunk(s) are not applicable/available in branch. The latter of which introduced with commit `89ba076de4` ("radeon/winsys: introduce radeon_winsys_cs_chunk"). Just drop "current." from the respective places to get things up and running again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96864 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 23:48:11 +01:00
Emil Velikov	3a146a789c	docs: add sha256 checksums for 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 23:48:11 +01:00
Emil Velikov	8b06176f31	docs: Update 12.0.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 14:45:24 +01:00
Emil Velikov	ab8938817f	Update version to 12.0.0(final) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-08 14:43:18 +01:00
Neha Bhende	88a095962f	svga: Fix failures caused in fedora 24 SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands are not present in the fedora 24 kernel module. Because of this, applications like supertuxkart fail to run. v2: Add few comments and code modifications suggested by Brian P. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> (cherry picked from commit `7988513ac3`)	2016-07-08 14:29:58 +01:00
Ilia Mirkin	450f076482	glsl: don't try to lower non-gl builtins as if they were gl_FragData If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a37e46323c`)	2016-07-08 14:29:50 +01:00
Emil Velikov	168fdc6a07	bugzilla_mesa.sh: Drop "Bug " from sed command After a recent Bugzilla update the word is no longer in the title. Thus the script ended up producing bogus HTML. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f35f8464ec`)	2016-07-07 16:12:34 +01:00
Akihiko Odaki	5193fe9f4f	mesa: don't install GLX files if GLX is not built Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp> [Emil Velikov: Drop guards around dri_interface.h, add stable tag] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `42968424fb`)	2016-07-07 16:12:34 +01:00
Mathias Fröhlich	4a3d510b5b	osmesa: Export OSMesaCreateContextAttribs. Since the function is exported like any other public api function and put in the header as if you could link against it, export it also from shared objects. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `13affe0d3f`)	2016-07-07 16:12:34 +01:00
Rob Clark	63b7c6ffc8	glsl: add driconf to zero-init uninitialized vars Some games are sloppy, perhaps because it is defined behavior for DX or perhaps because nv blob driver defaults things to zero. So add driconf param to force uninitialized variables to default to zero. This issue was observed with rust, from steam store. But has surfaced elsewhere in the past. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `f78a6b1ce3`)	2016-07-07 16:12:33 +01:00
Rob Clark	3276443935	i965: don't drop const initializers in vector splitting Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `01ccb0d91e`)	2016-07-07 16:12:33 +01:00
Rob Clark	8a700c562c	freedreno: fix crash on smaller gpus and higher resolutions Devices with smaller GMEM size need more tiles. On db410c at 2048x1152, glmark2 shadow needed ~330 tiles for fullscreen. Lets bump it up to 512. (Maybe with MRT you could end up needing more, but at that point things are probably going to be painfully slow.) Signed-off-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `7295428e41`)	2016-07-07 16:12:33 +01:00
Emil Velikov	dcc4df858a	anv: vulkan: remove the anv_device.$(OBJEXT) rule Atm the actual rule will expand to foo.o which is used for static libraries only. Thus the automake manual recommendation [to use OBJEXT] won't help us, since we're working with a shared library. Thus let's 'demote' the file and add it back to BUILT_SOURCES. This will manage all the complexity for us, at the (existing expense) of working only with the all, check and install targets. The crazy (why the issue was hard to spot): If the dependencies (.deps/*.Plo) are already created one can alter the anv_device.$(OBJEXT) line and/or nuke it all together. That won't lead to any warnings/issues, even though the Makefile is regenerated. Moral of the story: Always rm -rf top_builddir or don't resolve the dependencies manually and use BUILT_SOURCES. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96825 Fixes: d7a604c3f7a ("anv: use cache uuid based on the build timestamp.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Mark Janes <mark.a.janes@intel.com> (cherry picked from commit `9618e2a24c`)	2016-07-07 16:12:33 +01:00
Emil Velikov	dbb4c3c7c8	anv: install the intel_icd.json to ${datarootdir} by default As mentioned by the spec (and used by Archlinux and Debian) default to ${datarootdir} as opposed to ${sysconfdir} for the default location. Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `cbc37f72e3`)	2016-07-07 16:12:33 +01:00
Emil Velikov	7af5c2834c	swr: automake: don't ship LLVM version specific generated sources Otherwise things will fail to build, if the builder is using another version of LLVM. v2: annotate all the dependencies of builder_gen.h v3: clean the generated files as needed v4: comment cleanups (Tim) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (v2) Reported-by: Chuck Atkins <chuck.atkins@kitware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `744d0d8f3b`)	2016-07-07 16:12:33 +01:00
Emil Velikov	0e5e20dca0	automake: don't mandate git_sha1.h/MESA_GIT_SHA1 It has proven subtle to get it right both from the build side POV (see commit list below) and builders due to their varying workflows. Furthermore it does not fully fulfil the reason why it was enforced - to detect uniqueness between different builds, in order to distinguish and invalidate Vulkan/GL caches. With that having a much better solution (previous commit) we can drop this solution. This effectively reverts the following commits: `359d9dfec3` ("mesa: automake: add directory prefix for git_sha1.h") `2c424e00c3` ("mesa: automake: ensure that git_sha1.h.tmp has the right attributes") `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") `8229fe68b5` ("automake: get in-tree `make distclean' working again.") Cc: Timo Aaltonen <tjaalton@debian.org> Cc: Haixia Shi <hshi@chromium.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (cherry picked from commit `22e9357028`)	2016-07-07 16:12:33 +01:00
Emil Velikov	cc2c350416	anv: use cache uuid based on the build timestamp. Do not rely on the git sha1: - its current truncated form makes it less unique - it does not account for local (Vulkan or otherwise) changes Use a timestamp produced at the time of build. It's perfectly unique, unless someone explicitly tinkers with their system clock. Even then chances of producing the exact same one are very small, if not zero. v2: Remove .tmp rule. It's not needed since we want the header to be regenerated each time we call make (Eric). v3: - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel) - Replace the generated header with a define, to prevent needless builds on consecutive `make' and/or `make install' calls. (Dave) v4: - Keep the timestamp generation at make time. (Jason) v5: - Ensure that file is regenerated on incremental builds. Cc: Michel Dänzer <michel@daenzer.net> Cc: Dave Airlie <airlied@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `addb099ce8`)	2016-07-07 16:12:33 +01:00
Emil Velikov	66fe2be1f5	clover: conditionally use MESA_GIT_SHA1 Considering how hard/annoying it was for many peoples' workflow to properly generate the macro, it will be demoted to conditionally available with follow-up commits. v2: Kill off gracious blank line (Vedran). Cc: mesa-stable@lists.freedesktop.org Cc: Vedran Miletić <vedran@miletic.net> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Vedran Miletić <vedran@miletic.net> (cherry picked from commit `f98530b739`)	2016-07-07 16:12:33 +01:00
Dave Airlie	568ba49673	Revert "st/glsl_to_tgsi: don't increase immediate index by 1." This reverts commit `27d456cc87`. DOH, what seems right and what is right with fp64 are always two different things. This regressed: spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader on radeonsi Reported-by: Michel Dänzer <michel@daenzer.net> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `cb728df967`)	2016-07-07 16:12:33 +01:00
Samuel Pitoiset	134523aa7d	nvc0/ir: reset the base offset for indirect images accesses In presence of an indirect image access, the base offset should be zeroed because the stride will be computed twice. This is a pretty rare situation but it can happen when tex.r > 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f3b9fff3c3`)	2016-07-07 16:12:33 +01:00
Samuel Pitoiset	9f364ed35e	gm107/ir: fix sign bit emission for FADD32I When emitting OP_SUB, the sign bit for FADD and FADD32I is not at the same position. It's at position 45 for FADD but 51 for FADD32I. This fixes the following piglit test: tests/spec/arb_fragment_program/fdo30337b.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cb828b7b18`)	2016-07-07 16:12:32 +01:00
Lionel Landwerlin	d0e1d6b1c8	anv/wsi: create swapchain images using specified image usage The image usage specified by the caller of vkCreateSwapchainKHR should be passed onto the internal image creation. Otherwise the driver might later crash when the user tries to use the image as a combined sampler even though the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT. Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be expected even if the swapchain is created without any flag. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791 Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `dbbc4fb4cc`)	2016-07-07 16:12:32 +01:00
Dave Airlie	7d40db8cdb	st/glsl_to_tgsi: don't increase immediate index by 1. Immediates are stored into a separate table, and are consolidated, so if we get an immediate we don't need to offset it as the index it has is correct. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `27d456cc87`)	2016-07-07 16:12:32 +01:00
Nicolai Hähnle	250e13e585	st/mesa: check the texture image level in st_texture_match_image Otherwise, 1x1 images of arbitrarily high level are accepted. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `07cc838b10`)	2016-07-07 16:12:32 +01:00
Nicolai Hähnle	5469fc9800	st/mesa: an incomplete texture may have a zero-size first image Fixes a regression introduced by commit `42624ea83` which triggered an assertion in dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0 While stImage must have a non-zero size as verified by the caller, we also look at the size of the base image in an attempt to make a better guess at the level0 size (this is important when the base image size is odd). However, the base image may have a zero size even when it exists. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `0ba053b34c`)	2016-07-07 16:12:32 +01:00
Chuck Atkins	951be8a50c	gallium: Force blend color to 16-byte alignment This aligns the 4-element color float array to 16 byte boundaries. This should allow compiler vectorizers to generate better optimizations. Also fixes broken vectorization generated by Intel compiler. v2: Fixed indentation and added a lengthy comment explaining the reason for the alignment. Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> (cherry picked from commit `d8d6091a84`)	2016-07-07 16:12:32 +01:00
Emil Velikov	b08e5a1940	Revert "swr: Refactor checks for compiler feature flags" This reverts commit a380199e3968462da8291e8dda25888f19e86783.	2016-07-07 16:12:32 +01:00
Chuck Atkins	2ad47d912e	swr: Refactor checks for compiler feature flags Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines such as __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t. their availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@Intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `c1bf6692be`) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-07-07 16:12:32 +01:00
Ian Romanick	bb819a9e21	mapi: Export all GLES 3.1 functions in libGLESv2.so Khronos recommends that the GLES 3.1 library also be called libGLESv2. It also requires that functions be statically linkable from that library. NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension since at least Mesa 10.5, so applications targeting Linux should use eglGetProcAddress to avoid problems running binaries on systems with older, non-GLES 3.1 libGLESv2 libraries. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com> Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com> Acked-by: Chad Versace <chad.versace@intel.com> (cherry picked from commit `5921f372c8`)	2016-07-07 16:12:32 +01:00
sonjiang	0076e14f53	radeon/uvd: fix a h265 context size bug Fixes a h265 video corruption bug which was caused by uvd fw interface changes. Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (cherry picked from commit `b928ff6f62`)	2016-07-07 16:12:32 +01:00
sonjiang	930425df1e	radeon/uvd: separate uvd context buffer from DPB Adapt driver for Polaris uvd firmware interface changes. Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (cherry picked from commit `5c80354a23`)	2016-07-07 16:12:32 +01:00
sonjiang	700c1412e7	radeon: uvd add uvd fw version for amdgpu Because the Polaris uvd fw interface changed, the driver needs to check the fw version to apply the right interface. This change adds the uvd fw version. Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> (cherry picked from commit `28f85eab49`) [Emil Velikov: resolve trivial s/bool/boolean/ conflicts] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/radeon/radeon_winsys.h	2016-07-07 16:12:31 +01:00
Samuel Pitoiset	1eaba7b5b3	gm107/ir: make sure that flagsDef is set when emitting setcond Rely on the existence of a second destination when emitting a setcond flag is dangerous, because this doesn't mean that the flag has been correctly set. Instead rely on flagsDef like what emitX() does for flagsSrc. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cc97b6a34a`)	2016-07-07 16:12:31 +01:00
Marek Olšák	684e555aaa	radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris This was missing. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c1dbc563f4`)	2016-07-07 16:12:31 +01:00
Kenneth Graunke	dcc6dde5f9	i965: Make emit_urb_writes() not produce an EOT message for GS. emit_urb_writes() contains code to emit an EOT write with no actual data when there are no output varyings. This makes sense for the VS and TES stages, where it's called once at the end of the program. However, in the geometry shader stage, emit_urb_writes() is called once for every EmitVertex(). We explicitly emit a URB write with EOT set at the end of the shader, separately from this path. So we'd better not terminate the thread. This could get us into trouble for shaders which do EmitVertex() with no varyings followed by SSBO/image/atomic writes. It also caused us to emit multiple sends with EOT set, which apparently confuses the register allocator into not using g112-g127 for all but the first one. This caused EU validation failures in OglGSCloth shaders in shader-db. (The actual application was fine, but shader-db thinks there are no outputs because it doesn't understand transform feedback.) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (cherry picked from commit `7e7e501acf`)	2016-07-07 16:12:31 +01:00
Kenneth Graunke	6d47c3eb4e	glsl: Ignore ir_texture in lower_const_arrays_to_uniforms. The only part of an ir_texture which can be an array is the offsets array in textureGatherOffsets() calls. We don't want to lower those, because they're required to remain constants. Fixes textureGatherOffsets with Gallium drivers such as llvmpipe, which commit `ef78df8d3b` regressed. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `a36a73a7b8`)	2016-07-07 16:12:31 +01:00
Samuel Pitoiset	7f0984ca51	gm107/ir: add missing setcond flags for LOP variants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7b9b096775`)	2016-07-07 16:12:31 +01:00
Samuel Pitoiset	94bfef8a71	gm107/ir: make use of LOP32I for all immediates LOP only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `83a4f28dc2`)	2016-07-07 16:12:31 +01:00
Dave Airlie	2bd400d953	virgl: reduce some limits for now These need to be passed from the host in caps structure if they are larger, this fixes a bunch of tests on Intel hw, that I'd put the limits too high for. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `c7cc264ca9`)	2016-07-07 16:12:31 +01:00
Samuel Pitoiset	d4ab5bcad5	gm107/ir: make use of MOV32I for all immediates MOV only allows to emit 19-bits immediates. This is similar to the previous fix I did for IMUL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c7fa3c92f8`)	2016-07-07 16:12:31 +01:00
Jordan Justen	b82362f52c	i965: Use miptree to decide format on multi-plane images for gen < 7 This wasn't handled correctly for multi-plane images on gen < 7 in `727a9b2493`. Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `367cf3a2e3`)	2016-07-07 16:12:31 +01:00
Samuel Pitoiset	2fc24cfd8c	gm107/ir: make use of IMUL32I for all immediates IMUL only allows to emit 19-bits immediates. This is similar to `d30768025a` which fixed the same thing for the GK110 emitter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b84c97587b`)	2016-07-07 16:12:30 +01:00
Jordan Justen	507d19c44f	i965: Skip update_texture_surface when the plane doesn't exist Reported-by: Grazvydas Ignotas <notasas@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96607 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com> (cherry picked from commit `727a9b2493`)	2016-07-07 16:12:30 +01:00
Kenneth Graunke	a6a246e17c	i965: Set fs_inst::base_mrf = -1 by default. On MRF platforms, we need to set base_mrf to the first MRF value we'd like to use for the message. On send-from-GRF platforms, we set it to -1 to indicate that the operation doesn't use MRFs. As MRF platforms are becoming increasingly a thing of the past, we've forgotten to bother with this. It makes more sense to set it to -1 by default, so we don't have to think about it for new code. I searched the code for every instance of 'mlen =' in brw_fs*cpp, and it appears that all MRF-based messages correctly program a base_mrf. Forgetting to set base_mrf = -1 can confuse the register allocator, causing it to think we have a large fake-MRF region. This ends up moving the send-with-EOT registers earlier, sometimes even out of the g112-g127 range, which is illegal. For example, this fixes illegal sends in Piglit's arb_gpu_shader_fp64-layout-std430-fp64-shader, which had SSBO messages with mlen > 0 but base_mrf == 0. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `3e04e3758e`)	2016-07-07 16:12:30 +01:00
Marek Olšák	a51a9d7ba3	radeonsi: fix fractional odd tessellation spacing for Polaris ported from Vulkan (and no source explains why this is needed) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `28d0d0c5b4`)	2016-07-07 16:12:30 +01:00
Marek Olšák	4f784775a7	radeonsi: fix a compute shader hang with big threadgroups on SI & CI ported from Vulkan Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `1e8adb0ee4`) [Emil Velikov: resolve trivial conflict in si_launch_grid()] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Conflicts: src/gallium/drivers/radeonsi/si_compute.c	2016-07-07 16:12:30 +01:00
Ilia Mirkin	48fe283158	nvc0: when mapping directly, provide accurate xfer info + start We were ignoring the incoming box parameters, and were providing totally bogus stride/layer stride, and other bits, for when a non-full-surface map was requested. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b433cb51e5`)	2016-06-24 21:33:24 +01:00
Nicolai Hähnle	f41f78cda1	radeonsi: drop the DRAW_PREAMBLE packet on Polaris It will be removed from the firmware for the Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `0da890e62c`)	2016-06-24 21:31:45 +01:00
Nicolai Hähnle	197e2eaea8	radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris The non-MULTI variants will be removed in Polaris firmware. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `2aa0485902`)	2016-06-24 21:30:37 +01:00
Jordan Justen	eadccf8c67	i965: Preserve the internal format of the dri image Since the OpenGLES API is strict about the internal format matching the for many operations, we need to preserve it. See _mesa_es3_error_check_format_and_type in src/mesa/main/glformats.c. Fixes ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96351 Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com> (cherry picked from commit `c36a363a2d`)	2016-06-24 21:29:47 +01:00
Kenneth Graunke	1fc705366c	i965: Implement rasterizer discard via SOL unless required for queries. We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED query, which involves passing all primitives to the clipper. When rasterizer discard is enabled, we program the clipper in REJECT_ALL mode, rather than using the SOL stage's "Rendering Disable" feature. See commit `f09b91f782` for an explanation of why we implement GL_PRIMITIVES_GENERATED this way. Apparently the SOL stage's "Rendering Disable" feature is a lot faster than having the clipper reject all primitives. It's safe to use when no GL_PRIMITIVES_GENERATED query is active, as we don't care about CL_INVOCATION_COUNT incrementing. This patch makes us use SO_RENDERING_DISABLE when no query is active, but continues falling back to the clipper in REJECT_ALL mode when the queries are enabled. It brings back the perf_debug for the clipper case (which I removed in commit `1f9445ff57`, thinking it wasn't useful). Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10) on my Broadwell GT2 laptop. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `b0629e6894`)	2016-06-24 21:28:59 +01:00
Kenneth Graunke	91de94a119	i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms. They're basically the same. Let's avoid the code duplication. v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught by Jason Ekstrand). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `4db98f8beb`)	2016-06-24 21:27:28 +01:00
Kenneth Graunke	d7ea3eada7	glsl: Don't constant propagate arrays. Constant propagation on arrays doesn't make a lot of sense. If the array is only accessed with constant indexes, then opt_array_splitting would split it up. Otherwise, we have variable indexing. If there's multiple accesses, then constant propagation would end up replicating the data. The lower_const_arrays_to_uniforms pass creates uniforms for each ir_constant with array type that it encounters. This means that it creates redundant uniforms for each copy of the constant, which means uploading too much data. It can even mean exceeding the maximum number of uniform components, causing link failures. We could try and teach the pass to de-duplicate the data by hashing constants, but it makes more sense to avoid duplicating it in the first place. We should promote constant arrays to uniforms, then propagate the uniform access. Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum number of uniform components by a huge margin and failed to link. On Broadwell: total instructions in shared programs: 9067702 -> 9068202 (0.01%) instructions in affected programs: 10335 -> 10835 (4.84%) helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent) HURT: 20 (Natural Selection 2) loops in affected programs: 4 -> 0 The hurt programs appear to no longer have a constarray uniform, as all constants were successfully propagated. Apparently before this patch, we successfully unrolled a loop containing array access, but only after promoting constant arrays to uniforms. With this patch, we unroll it first, so all array access is direct, and the array is split up, and individual constants are propagated. This seems better. Cc: mesa-stable@lists.freedesktop.org Reported-by: Karol Herbst <nouveau@karolherbst.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `fb857b5eea`)	2016-06-24 21:26:39 +01:00
Kenneth Graunke	0af4f5c1ba	glsl: Make lower_const_arrays_to_uniforms work directly on constants. There's really no point in looking at ir_dereference_array of a constant. It also misses cases like: (assign () (var_ref tmp) (constant (array ...) ...)) No changes in shader-db, but keeps it working after the next commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `ef78df8d3b`)	2016-06-24 21:25:50 +01:00
Kenneth Graunke	335193107a	i965: Copy propagate before doing variable index lowering. The scalar backend currently doesn't support variable indexing on temporary arrays, but it does support it on uniform arrays, and some stages support it for input arrays. Make sure these are propagated through before exploding indirects into piles of if-ladders unnecessarily. On Broadwell, no instruction count change in shader-db. total cycles in shared programs: 80675652 -> 80674928 (-0.00%) cycles in affected programs: 649972 -> 649248 (-0.11%) helped: 386 HURT: 165 This will help avoid code quality regressions in a future commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `f7741c5211`)	2016-06-24 21:25:00 +01:00
Kenneth Graunke	32bb867118	glsl: Propagate invariant/precise after lowering const arrays. The new uniform may need precise as well. Fixes copy propagation of constant array uniforms in Tomb Raider shaders. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `586f4a42e7`)	2016-06-24 21:24:09 +01:00
Kenneth Graunke	60b5ba557e	glsl: Split arrays even in the presence of whole-array copies. Previously, we failed to split constant arrays. Code such as int[2] numbers = int[](1, 2); would generate a whole-array assignment: (assign () (var_ref numbers) (constant (array int 4) (constant int 1) (constant int 2))) opt_array_splitting generally tried to visit ir_dereference_array nodes, and avoid recursing into the inner ir_dereference_variable. So if it ever saw an ir_dereference_variable, it assumed this was a whole-array read and bailed. However, in the above case, there's no array deref, and we can totally handle it - we just have to "unroll" the assignment, creating assignments for each element. This was mitigated by the fact that we constant propagate whole arrays, so a dereference of a single component would usually get the desired single value anyway. However, I plan to stop doing that shortly; early experiments with disabling constant propagation of arrays revealed this shortcoming. This patch causes some arrays in Gl32GSCloth's geometry shaders to be split, which allows other optimizations to eliminate unused GS inputs. The VS then doesn't have to write them, which eliminates the entire VS (5 -> 2 instructions). It still renders correctly. No other change in shader-db. v2: Drop !AOA check and improve a comment (feedback from Tim Arceri). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `c264fdbc07`)	2016-06-24 21:23:20 +01:00
Kenneth Graunke	9013f56bb7	glsl: Make constant propagation's folder not propagate into an LHS. opt_constant_propagation.cpp contains constant folding code which can actually do constant propagation in some cases. It was happily propagating constants into the left-hand-side of assignments. For example, (assign () (var_ref temp) (constant ...)) would brilliantly be turned into: (assign () (constant ...) (constant ....)) This is a bigger hammer than necessary - it prevents propagation into the left-hand-side altogether. We could certainly do better someday. Notably, the constant propagation pass itself already takes this approach - it's just the constant propagation pass's built-in constant folding code (which actually propagates, too) that was broken. No change in shader-db, but prevents regressions after future commits. It seems plausible that this could be hit today, but I haven't seen it happen. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `acf5444044`)	2016-06-24 21:22:31 +01:00
Ardinartsev Nikita	133d0f0882	i965: Avoid division by zero. Fixes regression introduced by `af5ca43f26` Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419 (cherry picked from commit `01c89ccc5d`)	2016-06-24 21:21:34 +01:00
Tim Rowley	1e8fb90f19	swr: push/pop DEBUG macro around llvm includes llvm redefines DEBUG; adding push/pop prevents an undefined reference to debug_refcnt_state in llvm-3.7+. v2: add undef DEBUG Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> (cherry picked from commit `9ca741c645`)	2016-06-24 21:20:28 +01:00
Jose Fonseca	6c1911effb	include: Require MSVC 2013 Update 4. Earlier MSVC 2013 releases have trouble compiling some of our C99 code, so make sure we have Update 4 to avoid confusion. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `805dbdf06d`)	2016-06-24 20:58:23 +01:00
Jason Ekstrand	aefcbf41ef	anv: Use different BOs for different scratch sizes and stages This solves a race condition where we can end up having different stages stomp on each other because they're all trying to scratch in the same BO but they have different views of its layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c2f2c8e407`)	2016-06-24 20:57:17 +01:00
Jason Ekstrand	892cbc202c	genxml: Make ScratchSpaceBasePointer an address instead of an offset While we're here, we also fixup MEDIA_VFE_STATE and rename the field in 3DSTATE_VS on gen6-7.5 to be consistent with the others. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `45c0f60999`)	2016-06-24 20:56:10 +01:00
Jason Ekstrand	7a4641cdbe	anv: Add an allocator for scratch buffers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `966bed17c1`)	2016-06-24 20:55:06 +01:00
Jason Ekstrand	13d82b7690	genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7 The pack header generation scripts can't handle the case where you have two addresses in the same dword; they just take whatever is the last one. This meant that the MCS address wasn't properly getting handled. Since we don't care about append counters, we can just re-arrange the XML for now. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `89ded099f8`)	2016-06-24 20:53:35 +01:00
Jason Ekstrand	94cd7425e8	anv,isl: Lower storage image formats in anv ISL was being a bit too clever for its own good and lowering the format for us. This is all well and good if we always want to lower it. However, the GL driver selectively lowers the format depending on whether the surface is write-only or not. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d82322eb18`)	2016-06-24 20:52:38 +01:00
Jason Ekstrand	40a9ffbbca	isl/state: Allow for full 31-bit buffer texture sizes Ivy Bridge and above can handle up to 2^31 elements for RAW buffer surfaces. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `97f12773b8`)	2016-06-24 20:51:48 +01:00
Jason Ekstrand	02bf08e124	isl/state: Don't use designated initializers for buffer surface state Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `bb64e666ba`)	2016-06-24 20:50:53 +01:00
Jason Ekstrand	feaa68e38a	isl/state: Add assertions for buffer surface restrictions Acked-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4061fde66e`)	2016-06-24 20:49:57 +01:00
Jason Ekstrand	72cc8544a8	isl/state: Don't set SurfacePitch for gen9 1-D textures This field is ignored by the hardware in this case and, on very large 1-D textures, it can end up being larger than the maximum allowed value. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ce24097abe`)	2016-06-24 20:49:05 +01:00
Jason Ekstrand	fcefb53c37	isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7 This matches better what happens on gen8 where the "Tiled Surface" and "Tile Walk" bits are combined into a single two-bit value. This is also more consistent with what the GL driver does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f47e23a8b6`)	2016-06-24 20:48:09 +01:00
Jason Ekstrand	913e9e14f0	isl/state: Emit no-op mip tail setup on SKL This hasn't ever been a problem in the past but it is recommended by the hardware docs. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `96706bad5f`)	2016-06-24 20:47:21 +01:00
Jason Ekstrand	a49f97fae3	isl/state: Only set cube face enables if usage includes CUBE_BIT It seems safe to set it all the time, but this reduces the diff between the way i965 does it and what ISL does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `14d7c16e50`)	2016-06-24 20:46:31 +01:00
Jason Ekstrand	672872051d	isl/state: Use the layout for computing qpitch rather than dimensions For depth/stencil 1-D textures on SKL, we want them laid out in the old format that has been used since gen4. In order for the surface state fill-out code to handle this, it needs to distinguish based on layout rather than just dimensionality. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5d24e9cfa1`)	2016-06-24 20:45:39 +01:00
Jason Ekstrand	667beb92a9	isl/state: Set the IntegerSurfaceFormat bit on Haswell This fixes 688 Vulkan CTS tests on Haswell. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6a43204afa`)	2016-06-24 20:44:46 +01:00
Jason Ekstrand	262282c1bf	isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `324103da75`)	2016-06-24 20:43:55 +01:00
Jason Ekstrand	350ae65585	isl/state: Don't set RenderTargetViewExtent for texture surfaces The docs specify that this only matters for render targets and surfaces used with typed dataport messages. On some platforms (gen4-6) the Depth field has more bits than RenderTargetViewExtent so we can have textures with more levels than we can render to. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `215282c9f4`)	2016-06-24 20:43:01 +01:00
Jason Ekstrand	415869c5c9	isl/state: Set SurfaceArray based on the surface dimension According to the PRM, you can't set SurfaceArray for 3D or buffer textures. There doesn't seem to be a good reason not to set it when we can. On the other hand, if we don't set it we can end up getting strange results for 1-layer array textures such as textureSize() returning the wrong results. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `bb326f7b01`)	2016-06-24 20:42:07 +01:00
Jason Ekstrand	6a3f08be3a	isl/state: Don't force-disable L2 bypass for everything We already set the bit in the few cases where it's required by the docs so there's no need to set it all the time. This has no noticeable perf impact for Dota 2 on Vulkan with the time demo I have. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `d050ffbce9`)	2016-06-24 20:41:15 +01:00
Jason Ekstrand	0315650532	isl/state: Refactor the setup of clear colors This commit switches clear colors to use #if's instead of a C if. This lets us properly handle SNB where the clear color field doesn't exist. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `87f0ffa646`)	2016-06-24 20:40:19 +01:00
Jason Ekstrand	9259e0f990	isl/state: Refactor the per-gen isl_to_gen_h/valign tables This moves the #if's around so that halign and valign have different sets of #if conditions. This also prepares us for SNB because isl_to_gen_halign is not defined at all on gen6. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `62a5e6e031`)	2016-06-24 20:39:23 +01:00
Jason Ekstrand	dbc94da586	isl/state: Return an extent3d from the halign/valign helper Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b1b0d6fb54`)	2016-06-24 20:38:27 +01:00
Jason Ekstrand	1dd276aa7c	isl/state: Put pitch calculations together This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a60ae9e10a`)	2016-06-24 20:37:27 +01:00
Jason Ekstrand	652161bdc8	isl/state: Put all dimension setup together and towards the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `70c8afc0c8`)	2016-06-24 20:36:22 +01:00
Jason Ekstrand	29b24d75eb	isl/state: Put surface format setup at the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e66e70ef47`)	2016-06-24 20:35:22 +01:00
Jason Ekstrand	8b3333d1df	isl/state: Remove some unused fields They're already zero-initialized and we have no plans of doing anything more interesting with them. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `39baea551f`)	2016-06-24 20:34:30 +01:00
Jason Ekstrand	bf59ce8869	isl/state: Don't use designated initializers for the surface state While designated initializers are nice, they also force us to put some things in the initializer and some things later. Surface state setup is complicated enough that this really hurts readability in the long run. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `caf2af4181`)	2016-06-24 20:33:35 +01:00
Jason Ekstrand	0a7671a309	genxml/gen8,9: Prefix the multisample format enum with MSFMT This is what gen7 does and it's nice to have a prefix Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `de1d194856`)	2016-06-24 20:32:34 +01:00
Jason Ekstrand	69234ef45e	i965/gen4: Subtract 1 from buffer sizes The PRM states that the values put in Width, Height, and Depth should be various bits from the value size - 1. We seem to have done this wrong more-or-less from the start. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2a1cc94d27`)	2016-06-24 20:31:42 +01:00
Jason Ekstrand	2681454102	i965/fs: Use a default Y coordinate of 0 for TXF on gen9+ Previously, we were incrementing length but not actually putting anything in the Y coordinate. This meant that 1-D TXF operations had a garbage array index. If the surface is emitted as 1-D non-array, the coordinate gets discarded and it works fine. If it happens to be bound as an array surface, it may count as an out-of-bounds array access and you get zero. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `0195299c86`)	2016-06-24 20:30:48 +01:00
Jason Ekstrand	e9fd680fde	i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `1436238b75`)	2016-06-24 20:29:56 +01:00
Jason Ekstrand	6a6947d89a	i965/blorp/gen8: Use the correct max level and layer in emit_surface_states We were adding in the base which is wrong because the values given in the miptree are relative to zero and not the base layer/level. Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `620f81d2ed`)	2016-06-24 20:29:03 +01:00
Jason Ekstrand	1673dec65c	i965: Drop the maximum 3D texture size to 512 on Sandy Bridge The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be set to the depth of a 3-D texture when rendering. Unfortunately, that field is only 9 bits on Sandy Bridge and prior so we can't actually bind a 3-D texture for rendering if it has depth > 512. On Ivy Bridge, this field was bumped to 11 bits so we can go all the way up to 2048. On Iron Lake and prior, we don't support layered rendering and we use OffsetX/Y hacks to render to particular layers so 2048 is ok there too. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6ba88bce64`)	2016-06-24 20:28:06 +01:00
Jason Ekstrand	af12f81147	i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly This is basically a direct translation of what we do for gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `0f9cd74aab`)	2016-06-24 20:27:12 +01:00
Jason Ekstrand	d9219b5b79	i965/gen4: Pull texture formats from the texture object not the miptree This makes texture views sort-of work. It doesn't add full texture view support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats piglit test on Iron Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ee39d3ba91`)	2016-06-24 20:26:14 +01:00
Ilia Mirkin	abfed13bf4	glsl: only match gl_FragData and not gl_SecondaryFragDataEXT There's special logic around finding gl_FragData. It latches onto any array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added by GL_EXT_blend_func_extended, fits those parameters as well. The real frag data array should have index 0 though, so we can use that to distinguish them. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `36ed1b695e`)	2016-06-24 20:25:10 +01:00
Ilia Mirkin	8ac0a713f7	nv50,nvc0: fix start_instance in manual push path The start instance is applied as an offset into the buffer directly, ignoring the divisor, not as an instance id offset that respects the divisor. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `1f4bca798d`)	2016-06-24 20:23:49 +01:00
Ilia Mirkin	f7af3868f7	translate: fix start_instance parameter in sse version The generic version gets this right already, but this was using an incorrect formula in SSE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `5b0d64886d`)	2016-06-24 20:22:16 +01:00
Jason Ekstrand	15d06d4d61	anv/cmd: Dirty descriptor sets when a new pipeline is bound Ever since `c2581a9375`, the binding table layout has depended on the pipeline. This means that whenever we change pipelines we also need to re-emit binding tables for the new layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `35b53c8d47`)	2016-06-24 20:21:18 +01:00
Jason Ekstrand	6fd7d618f4	anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c It's tiny and fully generic so there's really no reason for it to be in a gen7-specific file. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2bfe0c3374`)	2016-06-24 20:20:21 +01:00
Jason Ekstrand	045d6bc023	anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c There's no good reason for recompiling it Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9df4d6bb36`)	2016-06-24 20:19:11 +01:00
Jason Ekstrand	b2fe134064	spirv: Use the system value version of gl_FrontFace SPIR-V treats it as an input but NIR wants the system value. This shouldn't have been too much of a surprise given that we have to do the same conversion in the GLSL IR to NIR pass. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `295e03c980`)	2016-06-24 20:03:46 +01:00
Kenneth Graunke	2e8129ddf8	i965: Reorganize prog_data->total_scratch code a bit. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `40013c5033`)	2016-06-24 18:17:50 +01:00
Emil Velikov	5e0b11cb6d	Update version to 12.0.0-rc4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-21 13:32:04 +01:00
Nicolai Hähnle	6306930c3f	st/mesa: flush bitmap cache before CopyImageSubData Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `f9ddd52317`)	2016-06-21 11:53:55 +01:00
Nicolai Hähnle	76377387c2	st/mesa: flush bitmap cache before texture functions As far as I can tell, a sequence of glBitmap followed by texture functions that refer to a texture bound as the framebuffer is well within what should be allowed. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `e7fff3cfe1`)	2016-06-21 11:52:36 +01:00
Nicolai Hähnle	6775b169cd	st/mesa: flush bitmap cache before compute dispatch In the unlikely case that a program uses glBitmap to render to a framebuffer whose texture is bound in a compute shader. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `c542b7e43d`)	2016-06-21 11:51:20 +01:00
Kenneth Graunke	a0235eb0f7	i965: Fix multiplication of immediates on Cherryview/Broxton. Cherryview and Broxton don't support DW x DW multiplication. We have piles of code to handle this, but apparently weren't retyping in the immediate case. For example, tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes makes the simulator angry about instructions such as: mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D Just retype to W or UW. It should be safe on all platforms. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `cd89c834a8`)	2016-06-21 11:49:55 +01:00
Jason Ekstrand	09a098bdeb	anv: Add proper support for depth clamping Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `eb6764c4a7`)	2016-06-21 11:48:39 +01:00
Jason Ekstrand	f3c8dde2e4	anv/cmd_buffer: Split emit_viewport in two Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `8a46b505cb`)	2016-06-21 11:47:20 +01:00
Jason Ekstrand	3fddb9fd46	anv/cmd_buffer: Set depth/stencil extent based on the image It used to be based on the framebuffer which isn't quite right. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `20e95a746d`)	2016-06-21 11:46:03 +01:00
Jason Ekstrand	f614a1f4d8	anv/cmd_buffer: Don't crash if push constants are provided for missing stages Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b65f2e4163`)	2016-06-21 11:44:48 +01:00
Jason Ekstrand	f4bc7218d5	anv/pipeline: Do invariance propagation on SPIR-V shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e6c2fe4519`)	2016-06-21 11:43:29 +01:00
Jason Ekstrand	77f241bd37	nir/alu_to_scalar: Respect the exact ALU operation qualifier Just setting builder->exact isn't sufficient because that only applies to instructions that are built with the builder but instructions created manually and only inserted using the builder are left alone. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `bec07b7292`)	2016-06-21 11:41:49 +01:00
Jason Ekstrand	deedb368de	nir: Add a pass for propagating invariant decorations This pass is similar to propagate_invariance in the GLSL compiler. The real "output" of this pass is that any algebraic operations which are eventually consumed by an invariant variable get marked as "exact". Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `202751fbb7`)	2016-06-21 11:37:37 +01:00
Jason Ekstrand	bac23b13eb	nir/algebraic: Remove imprecise flog2 optimizations While mathematically correct, these two optimizations result in an expression with substantially lower precision than the original. For any positive finite floating-point value, log2(x) is well-defined and finite. More precisely, it is in the range [-150, 150] so any sum of logarithms log2(a) + log2(b) is also well-defined and finite as long as a and b are both positive and finite. However, if a and b are either very small or very large, their product may get flushed to infinity or zero causing log2(a * b) to be nowhere close to log2(a) + log2(b). This imprecision was causing incorrect rendering in Talos Principle because part of its HDR rendering process involves doing 8 texture operations, clamping the result to [0, 65000], taking a dot-product with a constant, and then taking the log2. This is done 6 or 8 times and summed to produce the final result which is written to a red texture. In cases where you have a region of the screen that is very dark, it can end up getting a result value of -inf which is not what is intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `68e308d853`)	2016-06-21 11:36:08 +01:00
Nicolai Hähnle	b03b256e92	radeonsi: fix calculation of valid RB mask per SE The old calculation treated too many RBs as disabled. Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `c95175581e`)	2016-06-21 11:34:38 +01:00
Nicolai Hähnle	52ae654569	radeonsi: raise SI_PM4_MAX_DW The old limit, introduced in commit `afa752d3f0`, was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `6c2e636982`)	2016-06-21 11:33:00 +01:00
Roland Scheidegger	f675339b22	gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9 Apparently, these are deprecated. There's some AutoUpgrade feature which is supposed to promote these to cmp/select, which apparently doesn't work with jit code. It is possible it's not actually even meant to work (see the bug filed against llvm which couldn't provide an answer either) but in any case this is meant to be only temporary unless the intrinsics are really illegal. So, just use the fallback code (which should be cmp/select, we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages to optimize this back to pmin/pmax in the end). This addresses https://llvm.org/bugs/show_bug.cgi?id=28176 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Aaron Watry <awatry@gmail.com> (cherry picked from commit `b0cf99165a`)	2016-06-21 11:31:08 +01:00
Ilia Mirkin	cdbcd315b3	nvc0: don't make use of push hint if there are no non-const user vbos This makes the check match up with what we do on nv50 as well - there's no point in switching over to the push path if everything's in managed buffers. This can happen when a shader uses a vertex without an enabled array - we end up passing it a constant attribute. This also has the effect of "fixing" some flickering in Talos. I have no idea why. I've stared at the push logic forwards, backwards, and sideways. By always forcing the push path (which is slow), the flickering also goes away, but other rendering is still wrong (specifically draw 383068 as identified in the bug). However by not switching over to the push path, draw 383068 is correct. Note that other flickering remains in Talos, like the red/green walls/floors. This takes care of the shadow flickering though. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `154c0a42a2`)	2016-06-21 11:29:27 +01:00
Ilia Mirkin	7f1a4dc740	gk104/ir: fix tex use generation to be more careful about eliding uses If we have a loop, instructions before the tex might be added as tex uses, and those may in fact dominate all other uses of the tex results. This however doesn't mean that we don't need a texbar after the tex. Only check if uses dominate each other if they are dominated by the tex. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565 Fixes: `7752bbc44` (gk104/ir: simplify and fool-proof texbar algorithm) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `1804aa0b80`)	2016-06-21 11:27:50 +01:00
Samuel Iglesias Gonsálvez	97440cc2ed	i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `bdab572a86`)	2016-06-17 14:41:16 +01:00
Iago Toral Quiroga	3265becac3	i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `0177dbb6c2`)	2016-06-17 14:40:12 +01:00
Ian Romanick	033279c961	mesa: If validation fails in a debug context just emit a debug message There are quite a few pipelines that desktop applications (including a bunch of piglit tests) can expect to have run but don't meet the GLES requirements. Instead of failing validation, just emit a debug message. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `6bec55a780`)	2016-06-17 14:39:16 +01:00
Ian Romanick	6572273631	glsl: Always strip arrayness in precision_qualifier_allowed Previously some callers of precision_qualifier_allowed would strip the arrayness from the type and some would not. As a result, some places would not notice that float[6], for example, needed a precision qualifier. Fixes the new piglit test no-default-float-array-precision.frag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> (cherry picked from commit `9c87282041`)	2016-06-17 14:38:08 +01:00
Kenneth Graunke	dab4a6001b	i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+. We still need to recompile the passthrough shader when this value changes, as it also affects the output vertex count. But otherwise, we can eliminate recompiles on Gen8+. We probably want to do this for Gen7 as well, but that requires rewriting the input release code to use a loop, which is a trade-off I'd need to consider in more detail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `c319512e16`)	2016-06-17 14:37:06 +01:00
Kenneth Graunke	286ed3aff0	glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so the best way to implement this is to pass it in via a uniform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `2b867264d2`)	2016-06-17 14:28:44 +01:00
Kenneth Graunke	baa6ef4ed0	i965: Use a uniform for gl_PatchVerticesIn in the TES. Fixes three GL44-CTS.tessellation_shader subtests: - max_patch_vertices - single.max_patch_vertices - tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn These use gl_PatchVerticesIn in the TES, but don't link against a TCS (which would allow the linker to lower it to a constant). We had no handling for the system value in the backend, so it would just assert fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `1bc194cd64`)	2016-06-17 14:27:42 +01:00
Kenneth Graunke	b7e91a0421	glsl: Optionally lower TES gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so we need to pass this value in as a uniform (unless the TES is linked against a TCS, in which case the linker can just replace this with a constant). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `0be2105137`)	2016-06-17 14:19:26 +01:00
Nicolai Hähnle	05c5ed47d1	mesa/main: fix integer overflows in _mesa_image_offset Found using -fsanitize=undefined. Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `6510e07345`)	2016-06-17 14:18:27 +01:00
Kenneth Graunke	a9647850d1	mesa: Pass gl_constant_value union into _mesa_fetch_state(). We've had some trouble in the past with copying integers around via float pointers, as the C compiler sometimes uses x87 floating point registers to load values on 32-bit systems. Passing the gl_constant_value union should be safer. To avoid churn, this patch creates a "GLfloat *value" variable so existing uses can stay the same. Not observed to fix anything, but I was in the area adding more integer state vars, and thought it'd be wise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `8b408972ff`)	2016-06-17 14:01:23 +01:00
Emil Velikov	7d41c8aa25	Update version to 12.0.0-rc3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-15 09:29:14 +01:00
Nicolai Hähnle	575f9eaa2d	radeonsi: mark buffer texture range valid for shader images When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization. This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer. Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `a64c7cd2ba`) Back-ported from commit `a64c7cd2ba`: - include util/u_format.h - code was extracted to si_set_shader_image in master, move it back Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> -- src/gallium/drivers/radeonsi/si_descriptors.c \| 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)	2016-06-15 09:29:14 +01:00
Ilia Mirkin	792a5ee425	nv50/ir: record number of threads in a compute shader Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `27a51ff9b4`)	2016-06-15 09:29:14 +01:00
Ilia Mirkin	59841f5466	nvc0/ir: limit max number of regs based on availability in SM This effectively limits registers to 32 and 64 for fermi and kepler when 1024 threads are used, but allows the full amount to be used with smaller thread sizes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `1f895caba0`)	2016-06-15 09:29:14 +01:00
Tomasz Figa	966ee94558	i965: Check return value of screen->image.loader->getBuffers (v2) The images struct is an uninitialized local variable on the stack. If the callback returns 0, the struct might not have been updated and so should be considered uninitialized. Currently the code ignores the return value, which (depending on stack contents) might end up in reading a non-zero value from images.image_mask and dereferencing further fields. Another solution would be to initialize image_mask with 0, but checking the return value seems more sensible and it is what Gallium is doing. v2: fix typos in commit message, fix indentation, remove unnecessary parentheses and pointer dereference to keep line length reasonable. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e7ab358e81`)	2016-06-15 09:29:14 +01:00
Dylan Baker	8ed5204182	isl: Replace bash generator with python generator This replaces the current bash generator with a python based generator using mako. It's quite fast and works with both python 2.7 and python 3.5, and should work with 3.3+ and maybe even 3.2. It produces an almost identical file except for a minor layout changes, and the addition of a "generated file, do not edit" warning. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5a87bc7181`)	2016-06-15 09:29:14 +01:00
Bas Nieuwenhuizen	28294573c7	radeonsi: Reinitialize all descriptors in CE preamble. This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed. To illustrate suppose we have two graphics IB's 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2. The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not loaded into CE RAM. Fix this by always restoring the entire CE RAM. v2: - Just load all descriptor set buffers instead of load and store the entire CE RAM. - Leave the ce_ram_dirty tracking in place for the non-preamble case. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Note: This commit differs from the one in master - `54f755fa0f` ("radeonsi: Reinitialize all descriptors in CE preamble.")	2016-06-15 09:29:13 +01:00
Emil Velikov	7bed792ebb	cherry-ignore: drop the "i965 bring back INTEL_PRECISE_TRIG" The commit that removes it isn't in branch, thus there's nothing to do here. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-15 09:29:13 +01:00
Samuel Iglesias Gonsálvez	7d5cdb7675	i965: Defeat the register stride checker in pull uniform messages. Pulling DF uniforms from pull constant buffer generates messages like: send(4) g12<1>DF g12<0,1,0>F sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1 which produces GPU hangs in Cherryview/Braswell: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 843: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." We should set the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `a0ed8503b7`)	2016-06-15 09:29:13 +01:00
Kenneth Graunke	465be91421	i965: Defeat the register stride checker in URB reads. Pulling DF inputs from the URB generates messages like: send(8) g23<1>DF g1<8,8,1>UD urb 3 SIMD8 read mlen 1 rlen 2 { align1 1Q }; which makes the simulator angry: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 823: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." Setting the source horizontal stride to QWord is insane, as it's the message header containing 8 URB handles in a single 32-bit DWord. Instead, we should whack the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `ed3ba651f6`)	2016-06-15 09:29:13 +01:00
Kenneth Graunke	4a6fecdf69	i965: Fix issues with number of VS URB entries on Cherryview/Broxton. Cherryview/Broxton annoyingly have a minimum number of VS URB entries of 34, which is not a multiple of 8. When the VS size is less than 9, the number of VS entries has to be a multiple of 8. Notably, BLORP programmed the minimum number of VS URB entries (34), with a size of 1 (less than 9), which is invalid. It seemed like this could be a problem in the regular URB code as well, so I went ahead and updated that to be safe. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `9f37df06da`)	2016-06-15 09:29:13 +01:00
Timothy Arceri	883a1b3bd2	glsl: make sure UBO arrays are sized in ES This check was removed in `5b2675093e`; add it back in. Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> https://bugs.freedesktop.org/show_bug.cgi?id=96349 (cherry picked from commit `b010fa8567`)	2016-06-15 09:29:13 +01:00
Vedran Miletić	a71e0fd8cd	clover: Update OpenCL version string to match OpenGL Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For both, always append git version suffix from git_sha1.h. v5: move semicolon to same line as MESA_GIT_SHA1. v4: drop #ifdef guards. v3: add missing include. v2: change CL_DEVICE_VERSION as well. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit `4825264f75`) Squashed with commit clover: Include generated sources in AM_CPPFLAGS git_sha1.c is generated in $(top_builddir)/src. Fixes out-of-tree builds since `4825264f75`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> (cherry picked from commit `fafe026dbe`)	2016-06-15 09:29:13 +01:00
Francisco Jerez	547b5d2daa	i965/fs: Fix regs_written for SIMD-lowered instructions some more. ISTR having suggested this during review of the recent FP64 changes to the SIMD lowering pass, but it doesn't look like it was taken into account in the end. Using the fs_reg::component_size helper instead of this open-coded variant makes sure that the stride is taken into account correctly. Fixes at least the following piglit tests with spilling forced on (since otherwise regs_written would be calculated incorrectly and the spilling code would be rather confused about how much data needs to be spilled): spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `bd9f972651`)	2016-06-15 09:29:13 +01:00
Francisco Jerez	7154fa614b	i965: Fix cross-primitive scratch corruption when changing the per-thread allocation. I haven't found any mention of this in the hardware docs, but experimentally what seems to be going on is that when the per-thread scratch slot size is changed between two pipelined draw calls, shader invocations using the old and new scratch size setting may end up being executed in parallel, causing their scratch offset calculations to be based in a different partitioning of the scratch space, which can cause their thread-local scratch space to overlap leading to cross-thread scratch corruption. I've been experimenting with alternative workarounds, like emitting a PIPE_CONTROL with DC flush and CS stall between draw (or dispatch compute) calls using different per-thread scratch allocation settings, or avoiding reuse of the scratch BO if the per-thread scratch allocation doesn't exactly match the original. Both seem to be as effective as this workaround, but they have potential performance implications, while this should be basically for free. Fixes over 40 failures in our CI system with spilling forced on (including CTS, dEQP and Piglit failures) on a number of different platforms from Gen4 to Gen9. The 'glsl-max-varyings' piglit test seems to be able to reproduce this bug consistently in the vertex shader on at least Gen4, Gen8 and Gen9 with spilling forced on. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `a84b5d43e2`)	2016-06-15 09:29:13 +01:00
Francisco Jerez	2cf78b4851	i965: Keep track of the per-thread scratch allocation in brw_stage_state. This will be used to find out what per-thread slot size a previously allocated scratch BO was used with in order to fix a hardware race condition without introducing additional stalls or memory allocations. Instead of calling brw_get_scratch_bo() manually from the various codegen functions, call a new helper function that keeps track of the per-thread scratch size and conditionally allocates a larger scratch BO. v2: Handle BO allocation manually instead of relying on brw_get_scratch_bo (Ken). Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `d960284e44`)	2016-06-15 09:29:12 +01:00
Francisco Jerez	b9f69df93d	i965: Fix scratch overallocation if the original slot size was already a power of two. The bitwise arithmetic trick used in brw_get_scratch_size() to clamp the scratch allocation to 1KB has the unintended side effect that it will cause us to allocate 2x the required amount of scratch space if the original per-thread scratch size happened to be already a power of two. Instead use the obvious MAX2 idiom to clamp the scratch allocation to the expected range. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `013ae4a70a`)	2016-06-15 09:29:12 +01:00
Kenneth Graunke	eaa8561230	i965: Fix encode_slm_size() to take a generation, not a device info. In the Vulkan driver, we have the generation number (a compile time constant) but not necessarily the brw_device_info struct. I meant to rework the function to take a generation number instead of a brw_device_info pointer to accommodate this. But I forgot, and left it taking a brw_device_info pointer, while making Vulkan pass the generation number (8, 9, ...) directly. This led to crashes. Brown paper bag fix for commit `87d062a940`. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `5a0d294d38`)	2016-06-15 09:29:12 +01:00
Kenneth Graunke	9edc2f1828	i965: Don't leak scratch BOs for TCS/TES. These need to be freed too. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `667e5cec76`)	2016-06-15 09:29:12 +01:00
Nanley Chery	5e41ac197f	anv/pipeline: Don't dereference NULL dynamic state pointers Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts of pCreateInfo members are moved to the earliest points at which they should not be NULL. This fixes a segfault seen in the McNopper demo, VKTS_Example09. v3 (Jason Ekstrand): - Fix disabled rasterization check - Revert opaque detection of color attachment usage Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a4a5917248`)	2016-06-15 09:29:12 +01:00
Nanley Chery	cdeb3e8eb4	anv: Document and rename anv_pipeline_init_dynamic_state() To reduce confusion, clarify that the state being copied is not dynamic. This agrees with the Vulkan spec's usage of the term. Various sections specify that the various pipeline state which have VkDynamicState enums (e.g. viewport, scissor, etc.) may or may not be dynamic. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a0d84a9ef9`)	2016-06-15 09:29:12 +01:00
Samuel Pitoiset	0b71ef5e46	nvc0/ir: clamp the UBO index for compute on Kepler We already check that the address is not "too far", but we should also clamp the UBO index in order to avoid looking at the wrong place in the driver cb. This is a pretty rare situation though. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7f257abc1b`)	2016-06-15 09:29:12 +01:00
Jimmy Berry	c01ebdc83e	st/va: hardlink driver instances to gallium_drv_video.so Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is consistent with vdpau and general gallium drivers. Note: some versions of libva can detect the gallium name and use the backend. Although that behaviour seems inconsistent since it only works for some platforms/backends. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0c0f841e5d`)	2016-06-15 09:29:12 +01:00
Emil Velikov	501e8421f8	swr: automake: add missing -I flag When building from a release tarball (where the generated/built files are in srcdir) in an OOT fashion we need to have both builddir and srcdir in the includes list. Otherwise we'll error out, as the file (header gen_knobs.h in this case) won't be in the location where we are looking. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `fcb5a75a66`)	2016-06-15 09:29:12 +01:00
Emil Velikov	3162e2f9fc	automake: add SWR to `make distcheck' gallium drivers This will allow us to catch missing files and build issues before getting the tarball out for general consumption. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f4d26856df`)	2016-06-15 09:29:11 +01:00
Emil Velikov	766f852616	configure.ac: strip out the llvm-config -march/mtune flags Otherwise drivers such as SWR that depend on providing their own values will fail to build. v2: Add -mcpu for good measure (Chuck) Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (cherry picked from commit `bab5ab6940`)	2016-06-15 09:29:11 +01:00
Chuck Atkins	b499d1062d	swr: Add missing headers for package inclusion CC: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c86fcaca72`)	2016-06-15 09:29:11 +01:00
Emil Velikov	939cd6edac	automake: get in-tree `make distclean' working again. With earlier commit we've handled the `make distclean' out of tree build, yet we failed to attribute that for in-tree builds the test condition will return 1. Thus effectively the target will be considered as "failed". Fixes: `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Andy Furniss <adf.lists@gmail.com> Reported-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `8229fe68b5`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	2b6817c91c	i965: Use the correct number of threads for compute shaders. We were programming the number of threads per subslice, when we should have been programming the total number of threads on the GPU as a whole. Thanks to Curro and Jordan for helping track this down! On Skylake GT3e: - Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x. - Improves performance in Synmark's Gl43CSDof by roughly 3.7x. - Improves performance in Synmark's Gl43GSCloth by roughly 1.18x. On Broadwell GT2: - Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x. - Improves performance in Synmark's Gl43CSDof by roughly 2.0x. - Improves performance in Synmark's Gl43GSCloth by 1.47035% +/- 0.255654% (n=25). On Haswell GT3e: - Improves performance in Unreal's Elemental Demo (in GL 4.3 mode) by roughly 1.10x. - Improves performance in Synmark's Gl43CSDof by roughly 1.18x. - Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/- 0.432771% (n=64). On Ivybridge GT2: - Improves performance in Unreal's Elemental Demo (in GL 4.2 mode) by roughly 1.03x. - Improves performance in Synmark's Gl43CSDof by roughly 1.25x. - No change in Synmark's Gl43CSCloth (n=28). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `0fb85ac08d`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	9a118c79e7	i965: Assert that the scratch spaces are in range. I don't know that anything actually guarantees this, but if we exceed the limits, we may end up overflowing and trashing random buffers that happen to be nearby in the VMA space, leading to rendering corruption, hangs, or worse. We should really fix this properly. However, the pitfall has existed for ages, so for now we should at least detect it. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `1db37ebecf`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	be426c46ab	i965: Fix CS scratch size calculations on Ivybridge and Baytrail. These are linear, not powers of two, and much more limited. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `a42a93dc12`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	02f381bb17	i965: Fix Haswell CS per-thread scratch space encoding. Most scratch stages use power of two sizes, in kilobytes, where 0 means 1kB. But compute shaders on Haswell have a minimum of 2kB, and use a representation where 0 = 2kB. This meant that we were effectively telling the hardware to allocate each thread twice as much space as we meant to, while simultaneously not allocating that much space in the buffer, leading to overflows. Note that the existing code is completely wrong for Ivybridge, but that will take additional work to sort out, so I've left it as is for now. A subsequent commit will take care of that. Together with the previous patches, this fixes rendering corruption on Synmark's Gl43CSDof on Haswell. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `147a90d82a`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	e84116f364	i965: Account for poor address calculations in Haswell CS scratch size. Curro figured this out by investigating the simulator. Apparently there's also a workaround in the Windows driver. I'm not sure it's actually documented anywhere. We were underallocating the scratch buffer by a factor of 128/70. v2: Rename threads_per_subslice to scratch_ids_per_subslice (suggested by Jordan Justen). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `a7d029d3df`)	2016-06-15 09:29:11 +01:00
Kenneth Graunke	6c5c1bc1b9	i965: Allocate scratch space for the maximum number of compute threads. We were allocating enough space for the number of threads per subslice, when we should have been allocating space for the number of threads in the entire GPU. Even though we currently run with a reduced thread count (due to a bug), we might still overflow the scratch buffer because the address calculation is based on the FFTID, which can depend on exactly which threads, EUs, and threads are executing. We need to allocate enough for every possible thread that could run. Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+. Earlier platforms need additional bug fixes. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `2213ffdb4b`)	2016-06-15 09:29:10 +01:00
Kenneth Graunke	fdcc6a855b	i965: Set subslice_total on Gen7/7.5 platforms. We'll use this for compute shader thread counts and scratch space calculations shortly. Note that subslices are referred to as "half slices" on Ivybridge. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `9cd8f95809`)	2016-06-15 09:29:10 +01:00
Kenneth Graunke	c9477e0a80	i965: Fix shared local memory size for Gen9+. Skylake changes the representation of shared local memory size: Size \| 0 kB \| 1 kB \| 2 kB \| 4 kB \| 8 kB \| 16 kB \| 32 kB \| 64 kB \| ------------------------------------------------------------------- Gen7-8 \| 0 \| none \| none \| 1 \| 2 \| 4 \| 8 \| 16 \| ------------------------------------------------------------------- Gen9+ \| 0 \| 1 \| 2 \| 3 \| 4 \| 5 \| 6 \| 7 \| The old formula would substantially underallocate the amount of space. This fixes GPU hangs on Skylake when running with full thread counts. v2: Fix the Vulkan driver too, use a helper function, and fix the table in the comments and commit message. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `87d062a940`)	2016-06-15 09:29:10 +01:00
Ilia Mirkin	8d9bf67bba	mesa: add drawbuffer argument to ClearNamedFramebufferfi This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since it's horrible to have differing APIs across library versions, we should attempt to minimize the impact by backporting it as far as possible and hope no one notices. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7d7e015381`)	2016-06-15 09:29:10 +01:00
Ilia Mirkin	ca009cf8ba	GL: update glcorearb.h to svn 32433 This brings in the fixed glClearNamedFramebufferfi definition, as well as a lot of GLsizei -> GLsizeiptr changes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `92351a71a8`)	2016-06-15 09:29:10 +01:00
Ilia Mirkin	7487d5cbdc	GL: update glext to svn 32957 This brings in defines from GL_EXT_window_rectangles and fixes the glClearNamedFramebufferfi definition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `f81374fd3e`)	2016-06-15 09:29:10 +01:00
Anuj Phogat	72dbdf6f89	gallium: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1) to the destination rectangle bounded by the locations (dstX0, dstY 0) and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. ----------- \| \| ------- --- \| \| \| \| \| \| ------- --- Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `466b320163`)	2016-06-15 09:29:10 +01:00
Anuj Phogat	dd1943f904	mesa: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1) to the destination rectangle bounded by the locations (dstX0, dstY 0) and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. ----------- \| \| ------- --- \| \| \| \| \| \| ------- --- Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `f8679badd4`)	2016-06-15 09:29:10 +01:00
Jason Ekstrand	6eb0240a32	anv/entrypoints: Rework #if guards This reworks the #if guards a bit. When Emil originally wrote them, he just guarded everything. However, part of what anv_entrypoints_gen.py generates is a hash table for looking up entrypoints based on their name. This table cannot get out of sync between C and python regardless of preprocessor flags. In order to prevent this, this commit makes us use void pointers in the dispatch table for those entrypoints which aren't available. This means that the dispatch table size and entry order is constant and it should never get out-of-sync with the python. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `8d37556ec9`)	2016-06-15 09:29:10 +01:00
Jason Ekstrand	eb0197ad53	anv/entrypoints: Use the function pointer types provided by vulkan.h This is a bit cleaner than generating the types ourselves when making the table. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9ed0d9dd06`)	2016-06-15 09:29:09 +01:00
Jason Ekstrand	242ac96a24	anv/entrypoints: Emit #if guards for all platforms Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d1a53f91ee`)	2016-06-15 09:28:54 +01:00
Nicolai Hähnle	c03b4444d1	st/mesa: use base level size as "guess" when available When an application specifies mip levels _before_ setting a mipmap texture filter, we will initially guess a single texture level. When the second level image is created, we try to allocate the full texture -- however, we get the base level size guess wrong if that size is odd. This leads to yet another re-allocation of the texture later during st_finalize_texture. Even worse, this re-allocation breaks a (reasonable) assumption made by st_generate_mipmaps, because the re-allocation in the finalization call will again allocate a single-level pipe texture (based on the non-mipmap texture filter!). As a result, mipmap generation fails in interesting ways. All of this can be avoided by just using the fact that we already know the size of the base level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `42624ea837`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	ad684cee3a	anv: Remove the PhysicalDeviceLimits FINISHME At this point, the limits are probably more-or-less correct. If there is an invalid limit, that's a bug not a FINISHME. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a1e69930e4`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	ea24c9be4a	anv/pipeline_cache: Allow for a zero-sized cache This gets ANV_ENABLE_PIPELINE_CACHE=false working again. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4f5bbf804b`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	86dbf1ef4b	anv/pipeline: Store the (set, binding, index) triple in the bind map This way the bind map (which we're caching) is mostly independent of the pipeline layout. The only coupling remaining is that we pull the array size of a binding out of the layout. However, that size is also specified in the shader and should always match so it's not really coupled. This fixes rendering issues in Dota 2. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a1a25db699`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	b1f217b5a9	anv/descriptor_set: Ensure that bindings are always in increasing order Since applications are allowed to specify some set of bindings which need not be dense they also need not be in order. For most things, this doesn't matter, but it could result in getting the wrong dynamic offsets. This adds a quick-and-dirty sort to ensure that everything is always in increasing order of binding index. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c13c5ac561`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	a0be8d3d08	anv/descriptor_set: Add a type field in debug builds This allows for some extra validation and makes it easier to see what's going on when poking around in gdb. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e2265926f2`)	2016-06-14 15:48:40 +01:00
Jason Ekstrand	901c78786f	anv/descriptor_set: Set array_size to zero for non-existent descriptors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cd21015abd`)	2016-06-14 15:48:40 +01:00
Leo Liu	986159437d	vl/dri3: support receiving new pixmap for front buffer With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2ad443e4cc`)	2016-06-14 15:48:39 +01:00
Leo Liu	ab75b22029	vl/dri3: get Makefile properly In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources. This file is shared with SCons, and SCons is not able to parse this macro, so the SCons build failed. Jose quickly gave two approaches and a quick fix with his second approach, thanks Jose for the solutions and fixes. This patch is Jose's first approach, and it's more proper, because the dri3 c file should not be included in the build when DRI3 is not enabled. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `0ef8500aab`)	2016-06-14 15:48:39 +01:00
Daniel Czarnowski	5cae2ac47e	glx: fix crash with bad fbconfig GLX documentation states: glXCreateNewContext can generate the following errors: (...) GLXBadFBConfig if config is not a valid GLXFBConfig Function checks if the given config is a valid config and sets proper error code. Fixes currently crashing glx-fbconfig-bad Piglit test. v2: coding style cleanups (Emil, Topi) use DefaultScreen macro (Emil) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cf804b4455`)	2016-06-14 15:48:39 +01:00
Jason Ekstrand	7d515b26bb	i965: Emit surface states for extra planes prior to gen8 When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8 but not for gen7 or earlier. It all works, we just need to emit the states for the extra planes. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `037ce5d734`)	2016-06-14 15:48:39 +01:00
Marc-André Lureau	0e554f54dc	virgl: fix checking fences When calling virgl_fence_wait() with timeout=0, virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE for a busy resource, whereas virgl_fence_wait() should return TRUE for a completed (non-busy) resource. This fixes running supertuxkart in a VM (I could not reproduce locally with vtest though there is a similar fix) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `dc81b3ad43`)	2016-06-14 15:48:39 +01:00
Nicolai Hähnle	201f357c52	st/mesa: directly compute level=0 texture size in st_finalize_texture The width0/height0/depth0 on stObj may not have been set at this point. Observed in a trace that set up levels 2..9 of a 2d texture, and set the base level to 2, with height 1. This made the guess logic always bail. Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat redundant storage of width0/height0/depth0 and makes sure we always compute pipe texture sizes that are compatible with the base level image of the GL texture. Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul. v2: - try to re-use an existing pipe texture when possible - handle a corner case where the base level is not level 0 and it is of size 1x1x1 v3: - ptHeight = ptWidth in cube map 1x1 case (suggested by Brian) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `bd5c41fe5f`)	2016-06-14 15:48:39 +01:00
Ilia Mirkin	bf3d6d9601	st/mesa: use buffer usage history to set dirty flags for revalidation We were previously unconditionally doing this for arrays and ubo's, and ignoring texture/storage/atomic buffers. Instead use the usage history to determine which atoms need to be revalidated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6e6fd911da`)	2016-06-14 15:48:39 +01:00
Marek Olšák	f51e99f704	gallium/radeon: don't allocate DCC for non-renderable texture formats R9G9B9E5 is the only uncompressed one hopefully. This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `d4d733e39d`)	2016-06-14 15:48:39 +01:00
Nicolai Hähnle	b2afa23a40	tgsi/scan: add uses_derivatives (v2) v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (cherry picked from commit `d3a584defe`)	2016-06-14 15:48:39 +01:00
Ilia Mirkin	6f38259419	st/mesa: revalidate image atoms when a texture is updated A texture may be redefined with _NEW_TEXTURE, which might have been bound to a shader image slot. We have to revalidate the image atoms to pick up on the new resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c81b090c92`)	2016-06-14 15:48:39 +01:00
Ilia Mirkin	bba2299735	gk104/ir: fix conditions for adding a texbar Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account. Based on an earlier patch by Samuel Pitoiset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `71ad8a173f`)	2016-06-14 15:48:38 +01:00
Dave Airlie	49c53a2987	i965/gen8: fix cull distance emission for tessellation shaders. This fixes some cases of: GL45-CTS.cull_distance.functional on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `c295923d13`)	2016-06-14 15:48:38 +01:00
Samuel Pitoiset	4306e01ece	nv50/ir: use round toward 0 when converting doubles to integers Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers. This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `08ddfe7b2f`)	2016-06-14 15:48:38 +01:00
Dave Airlie	b9920d2bba	mesa/program_resource: return -1 for index if no location. The GL4.5 spec quote seems clear on this: "The value -1 will be returned by either command if an error occurs, if name does not identify an active variable on programInterface, or if name identifies an active variable that does not have a valid location assigned, as described above." This fixes: GL45-CTS.program_interface_query.output-built-in [airlied: use _mesa_program_resource_location_index as suggested by Eduardo] Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `07403014c3`)	2016-06-14 15:48:38 +01:00
Nicolai Hähnle	9bf30be693	radeonsi: set descriptor dirty mask on shader buffer unbind Found randomly while skimming the code. This might have caused VM faults in robustness tests. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `ec2b52e2d9`)	2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez	05d33806cd	i965/gs/scalar: Fix load input for doubles Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2b648ec17c`)	2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez	507d25f6f1	i965/fs: fix offset when loading double vector input varyings When we are not packing a double input varying, we might need to read its data in a non-aligned to 64-bit offset, so we read the wrong data. This is happening when using explicit locations in varyings because Mesa disables packing varying for that case. const_index is in 32-bit size units but offset() is multiplying it by destination type size units. When operating with double input varyings, const_index value could be not aligned to 64 bits. To fix it, we load the double vector as if it was a float based vector with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2d6f82a294`)	2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez	4daa331e25	i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings Data starts at suboffset 3 in 32-bit units (12 bytes), so it is not 64-bit aligned and the current implementation fails to read the data properly. Instead, when there is a double input varying, read it as a vector of floats with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cb30727648`)	2016-06-14 15:48:38 +01:00
Dave Airlie	fc0a469e4c	glsl: geom shader max_vertices layout must match. From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs". "all geometry shader output vertex count declarations in a program must declare the same count." Fixes: GL45-CTS.geometry_shader.output.conflicted_output_vertices_max Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `4c86399378`)	2016-06-14 15:48:38 +01:00
Dave Airlie	0ce3dc9a30	i965: don't use NumLayers for 3D textures. For 3D textures we shouldn't be using NumLayers, we need to get it from the depth. This fixes: GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `ff2e569153`)	2016-06-14 15:48:37 +01:00
Dave Airlie	89bc5f9a90	glsl: for anonymous struct matching use without_array() (v3) With tessellation shaders we can have cases where we have arrays of anon structs, so make sure we match using without_array(). Fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in v2: test lengths match as well (Ilia) v3: descend array lengths to check for matches as well (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `1f66a4b689`)	2016-06-14 15:48:37 +01:00
Dave Airlie	09f48203c5	glsl/ast: don't crash when func_name is NULL This fixes a crash in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types If we can't find the func_name in one of these paths, we have emitted an earlier error so just return here. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `6702c15810`)	2016-06-14 15:48:37 +01:00
Dave Airlie	997bcc45ec	glsl: handle ast_aggregate in has_sequence_subexpression. (v2) GL43-CTS.compute_shader.work-group-size does uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 }; The initializer triggers the GLSL 4.30/GLES3 tests for constant sequence subexpressions, so it doesn't happen unless you are using those, so just return false as this path is now reachable. v2: update commit msg with diagnosis Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `4336196b7f`)	2016-06-14 15:48:37 +01:00
Ilia Mirkin	a0e36438a8	nv50,nvc0: fix BGR10_A2UI vertex format This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `092ec3920f`)	2016-06-14 15:48:37 +01:00
Samuel Pitoiset	954829ebbb	nvc0: do not clear surfaces bins in the validate function We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `be365f34f0`)	2016-06-14 15:48:37 +01:00
Samuel Pitoiset	f12a16ec99	nvc0: re-validate images after launching a grid on Fermi Image invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a compute shader to help in fixing 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later. This fixes the following piglits: - spec/arb_compute_shader/execution/basic-uniform-access - spec/arb_compute_shader/execution/mutiple-texture-reading - spec/arb_compute_shader/execution/multiple-workgroups - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `43d3ecfb33`)	2016-06-14 15:48:37 +01:00
Ilia Mirkin	ceb9ed0e38	nvc0: reduce overhead from always marking images dirty We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `fd6bbc2ee2`)	2016-06-14 15:48:37 +01:00
Ilia Mirkin	5a63ae9f15	nvc0: reduce overhead from always marking buffers dirty We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `0f673db6f0`)	2016-06-14 15:48:37 +01:00
Ilia Mirkin	a95560bac5	nvc0: fix memory barrier flag handling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e8ee161b16`)	2016-06-14 15:48:37 +01:00
Ilia Mirkin	1adbe2f45c	nvc0: mark bound buffer range valid Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `29abbeecd8`)	2016-06-14 15:48:37 +01:00
Marek Olšák	ccc9783a98	r600g: write WAIT_UNTIL in the correct place This has been wrong all along. Fixing this will allow removing useless cache flushes. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit `7746903d3a`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	c632590996	anv/blit: Use CLAMP_TO_EDGE for scaled blits When upscaling you can end up interpolating between the edge pixel and one past the edge. Using CLAMP_TO_EDGE seems like the most reasonable thing to do in this case. This fixes two of the new Vulkan CTS tests in dEQP-VK.api.copy_and_blit.blit_image.* Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `441194edd9`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	2830ae638c	anv/copy: Account for the anv_surface.offset when creating a blit2d_surf This was causing problems if the user tried to copy to/from the stencil portion of a combined depth/stencil image. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9313a56816`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	5ca18b6a4b	nir/spirv: Make a decoration switch complete Getting rid of the default case makes the compiler warn if we are missing cases. While we're here, we also add the one missing case. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `526a8de22d`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	bb4ff53a71	nir/spirv: Make unhandled decorations and capabilities non-fatal glslang frequently throws bogus decorations into shaders. While we are free to assert-fail, it's a bit nicer to the application to just warn. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `62c6e94bd6`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	d5dc87a1ef	nir/spirv: Add a way to print non-fatal warnings Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ed14d21d04`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	85579221a4	nir/spirv: Add string lookup tables for a couple of SPIR-V enums Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `2e46a5d155`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	6eb39fa255	nir/spirv: Complete the list of capabilities Previously we supported a subset of capabilities and just left a default case for the others. It's time to stop being lazy and actually audit the capabilities. This should bring them up-to-date with reality. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5a1e56f344`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	13999dc70d	anv/pipeline: Add support for early depth stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9fa958e95b`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	a3fce26907	i965/fs: Add a wm_prog_data bit for has_side_effects This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `3fb289f957`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	7ddbe91435	anv/pipeline: Silently pass tests if depth or stencil is missing Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `56a178922f`)	2016-06-14 15:48:36 +01:00
Jason Ekstrand	c6ca6f0728	anv/pipeline: Unify gen7/8 emit_ds_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `bc7f7e1953`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	300737042c	genxml/gen6,7,75: s/BackFace/Backface This is more consistent with gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `fdc3c5dd05`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	09b2be7a51	nir/spirv: Handle the WorkgroupSize builtin decoration This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests in the latest dev version of the Vulkan CTS. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `1f7b54ed29`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	6efd37f30d	nir/spirv: Use breaks instead of returns in constant handling Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b26cdd65e8`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	af2a278dfe	anv/pipeline: Refactor specialization constant handling a bit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a19ae36ce5`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	07f5e621cf	nir/lower_indirect_derefs: Use the direct array deref for recursion This fixes about 100 of the new Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `45542f554c`)	2016-06-14 15:48:35 +01:00
Jason Ekstrand	d0dddbf4ee	anv/clear: Handle ClearImage on 3-D images Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `59f06ac389`)	2016-06-14 15:48:35 +01:00
Francisco Jerez	cbb02ebd74	Revert "i965/fs: Allow scalar source regions on SNB math instructions." This reverts commit `c1107cec44`. Apparently the hardware spec text I quoted in the commit message was outright lying about scalar source math being supported on SNB, the hardware seems to load 32 contiguous bits of data for each channel regardless of the regioning mode. Fixes regressions in the following CTS tests (which we didn't catch early due to CTS being temporarily disabled in our CI system): es2-cts.gtf.gl.atan.atan_vec3_frag_xvary es2-cts.gtf.gl.cos.cos_vec2_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvary es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_float_frag_xvary es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_vec3_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346 Reported-by: Mark Janes <mark.a.janes@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `7244dc1e06`)	2016-06-14 15:48:35 +01:00
Francisco Jerez	d5528d5f55	i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N). The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I originally noticed this while working on SIMD32 in the scalar back-end, but the same scenario is likely to be possible in vec4 programs so this commit ports the bugfix with the same name from the scalar back-end to the vec4 cmod propagation pass. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `a2135c6fd9`)	2016-06-14 15:48:35 +01:00
Emil Velikov	0e540b4a15	anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS Otherwise we will fail to find the headers in some scenarios. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> (cherry picked from commit `7a3a0d9212`)	2016-06-14 15:48:35 +01:00
Dave Airlie	911eddd37b	mesa/get: return correct value for layer provoking vertex. This fixes: GL45-CTS.geometry_shader.layered_rendering.layered_rendering on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `d10ae20b96`)	2016-06-10 12:36:36 +01:00
Samuel Pitoiset	2185edf699	nvc0: mark buffer texture range valid for shader images Loosely based on radeonsi (Thanks to Nicolai). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `28590eb949`)	2016-06-10 12:35:15 +01:00
Francisco Jerez	09f0e97d1c	i965/fs: Reindent emit_zip(). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `060c8d245d`)	2016-06-10 12:34:19 +01:00
Francisco Jerez	2db670cf3e	i965/fs: Skip SIMD lowering destination zipping if possible. Skipping the temporary allocation and copy instructions is easy (just return dst), but the conditions used to find out whether the copy can be optimized out safely without breaking the program are rather complex: The destination must be exactly one component of at most the execution width of the lowered instruction, and all source regions of the instruction must be either fully disjoint from the destination or be aligned with it group by group. v2: Don't handle partial source-destination overlap for simplicity (Jason). No instruction count regressions with respect to v1 in either shader-db or the few FP64 shader_runner test-cases with partial overlap I've checked manually. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `7aa76d66a1`)	2016-06-10 12:33:14 +01:00
Anuj Phogat	5000556d5d	blorp: Fix 16x multisample scaled blits Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled (with added 16x sample support) now passes with this patch. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `75da9c9933`)	2016-06-10 12:32:17 +01:00
Dave Airlie	ab525a637a	mesa/copyimage: report INVALID_VALUE for missing cube face The spec says INVALID_VALUE for exceeding dimensions, which is really what is happening here. This fixes: GL45-CTS.copy_image.non_existent_mipmap Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `af7bf610cf`)	2016-06-10 12:31:22 +01:00
Dave Airlie	d130c53ac1	mesa/copyimage: fix num samples check to handle renderbuffers. This test was only happening for textures, but there is nothing in the spec to say this, so test it for all cases. This fixes: GL45-CTS.copy_image.invalid_target Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `c0856eacf1`)	2016-06-10 12:30:23 +01:00
Nanley Chery	669836e1be	mesa/extensions: Fix ES1 extension reporting Commit `eda15abd84` unintentionally advertised these extensions in ES1 contexts. Undo this error. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c06cef7f9b`)	2016-06-10 12:26:10 +01:00
Eric Engestrom	1dce03e4c1	st/osmesa: remove double-write (overwriting) These two lines have been here since the file was created. I'm guessing the second one was just for testing during dev, so it's the one that's going away. CoverityID: 1296205 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> (cherry picked from commit `17f4c723eb`)	2016-06-10 12:08:02 +01:00
Emil Velikov	a7649abe9f	Update version to 12.0.0-rc2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:35:59 +01:00
Emil Velikov	bcfda0a1fe	mesa: automake: distclean git_sha1.h when building OOT In the case of out-of-tree (OOT) builds, in particular when building from tarball, we'll end up with the file in both srcdir and builddir. We want the former to remain intact (since we need it on rebuild) while the latter should be removed otherwise `make distclean' gets angry at us. Ideally there'll be a solution that feels a bit less of a hack. Until then this does the job exactly as expected. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b7f7ec7843`)	2016-06-07 12:35:53 +01:00
Emil Velikov	998e503592	mesa: automake: ensure that git_sha1.h.tmp has the right attributes ... when copied from git_sha1.h. As the latter file can be lacking the write attribute, one should set it explicitly. Otherwise we'll get a warning/failure at cleanup stage. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2c424e00c3`)	2016-06-07 12:35:50 +01:00
Emil Velikov	5e3e292502	mesa: automake: add directory prefix for git_sha1.h Otherwise the build will assume that we're talking about builddir, which is not the case in the else statement. Here the file is already generated and is part of the tarball. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `359d9dfec3`)	2016-06-07 12:35:46 +01:00
Emil Velikov	3be5c6a9ec	egl: android: don't add the image loader extension for !render_node With an earlier commit we introduced support for render_node devices, which was coupled with the use of the image loader extension. As the work was inspired by egl/wayland we (erroneously) added the extension for the !render_node path as well. That works for wayland, as the implementations of the DRI2 and IMAGE loader extensions converge behind the scenes. As that is not yet the case for Android we shouldn't expose the extension. Fixes: `34ddef39ce` ("egl: android: add dma-buf fd support") Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `1816c837c1`)	2016-06-07 12:35:40 +01:00
Emil Velikov	a26ca04fe3	anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards The generated sources should follow the example set by the vulkan headers and our non-generated code. Namely: the code for all supported platforms should be available, each one guarded by its respective VK_USE_PLATFORM_*_KHR macro. v2: Reword commit message. Cc: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC) (cherry picked from commit `b8e1f59d62`)	2016-06-03 01:44:56 +01:00
Mauro Rossi	1a5d6a232f	isl: add support for Android libmesa_isl static library isl library is needed to build i965, libmesa_isl static library is added to fix related Android building errors. Any attempt to build libmesa_genxml as phony package module failed to deliver gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9} Due to constraints in Android Build System, libmesa_genxml is built as static, at least one source is needed, so dummy.c is autogenerated for this scope, libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES, to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `278c2212ac`)	2016-06-02 22:35:29 +01:00
Mauro Rossi	702a1121c9	android: libmesa_glsl: add a dependency on libmesa_nir static Fixes the following building error: target C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0, from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28: external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory compilation terminated. build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1 make: *** Waiting for unfinished jobs.... Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `4143245c23`)	2016-06-02 22:35:29 +01:00
Emil Velikov	9a21315ea9	isl: automake: don't include isl_format_layout.c in two lists. Including the file in both ISL_FILES and ISL_GENERATED_FILES makes the actual dependency list less obvious. v2: Drop unrelated vulkan hunk (Jason). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `af1a0ae8ce`)	2016-06-02 22:35:29 +01:00
Emil Velikov	94630ce0c7	automake: bring back the .PHONY git_sha1.h.tmp rule With earlier commit `3689ef32af` ("automake: rework the git_sha1.h rule, include in tarball") we/I erroneously removed the PHONY rule and the temporary file. The former is used to ensure that the header is regenerated on each make invocation, while the latter helps us avoid the unneeded rebuild(s) when the SHA1 hasn't changed. Reported-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `af2637aa32`)	2016-06-02 22:35:29 +01:00
Christian König	6ad61d90ea	radeon/uvd: fix the H264 level for Tonga v2 We support 5.2 for a while now. v2: we even support 5.2 for H264, 5.1 is for HEVC. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b3e75c3997`)	2016-06-02 14:04:14 +01:00
Jordan Justen	a136b8bfe2	i965: Remove old CS local ID handling The old method pushed data for each channel's uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `0a3acff5b5`)	2016-06-02 14:02:05 +01:00
Jordan Justen	52ba7abe1e	i965: Enable cross-thread constants and compact local IDs for hsw+ The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. One complication is that cross-thread constants are loaded into registers before per-thread constants. Previously, our local IDs were loaded before the uniform data and treated as 'payload' data, even though they were actually pushed into the registers like the other uniform data. Therefore, in this patch we simultaneously enable a newer layout where each thread now uses a single uniform slot for a unique local ID for the thread. This uniform is handled specially to make sure it is added last into the uniform push constant registers. This minimizes our usage of push constant registers, and maximizes our ability to use cross-thread constants for registers. To swap from the old to the new layout, we also need to flip some lowering pass switches to let our driver handle the lowering instead. We also no longer force thread_local_id_index to -1. v4: * Minimize size of patch that switches from the old local ID layout to the new layout (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `b1f22c6317`)	2016-06-02 14:01:31 +01:00
Jordan Justen	28ecf2b90e	anv: Support new local ID generation & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `3ba9594f32`)	2016-06-02 14:01:04 +01:00
Jordan Justen	ead833a395	i965: Support new local ID push constant & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `30685392e0`)	2016-06-02 13:59:04 +01:00
Jordan Justen	ee77c4a099	i965: Add CS push constant info to brw_cs_prog_data We need information about push constants in a few places for the GL driver, and another couple places for the vulkan driver. When we add support for uploading both a common (cross-thread) set of push constants, combined with the previous per-thread push constant data, things are going to get even more complicated. To simplify things, we add push constant info into the cs prog_data struct. The cross-thread constant support is added as of Haswell. To support it we need to make sure all push constants with uniform values are added to earlier registers. The register that varies per thread and holds the thread invocation's unique local ID needs to be added last. For now we add the code that would calculate cross-thread constant information for hsw+, but we force it (cross_thread_supported) off until the other parts of the driver support it. v4: * Support older local ID push constant layout as well. (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `d437798ace`)	2016-06-02 13:56:54 +01:00
Jordan Justen	a94be40ecc	i965: Store number of threads in brw_cs_prog_data Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `1b79e7ebbd`)	2016-06-02 13:54:44 +01:00
Jordan Justen	632d7ef148	i965: Add nir based intrinsic lowering and thread ID uniform We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrinsic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `3ef0957dac`)	2016-06-02 13:53:58 +01:00
Jordan Justen	5513300f59	i965: Put CS local thread ID uniform in last push register This thread ID uniform will be used to compute the gl_LocalInvocationIndex and gl_LocalInvocationID values. It is important for this uniform to be added in the last push constant register. fs_visitor::assign_constant_locations is updated to make sure this happens. The reason this is important is that the cross-thread push constant registers are loaded first, and the per-thread push constant registers are loaded after that. (Broadwell adds another push constant upload mechanism which reverses this order, but we are ignoring this for now.) v2: * Add variable in intrinsics lowering pass * Make sure the ID is pushed last in assign_constant_locations, and that we save a spot for the ID in the push constants v3: * Simplify code based on Jason's suggestions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `04fc72501a`)	2016-06-02 13:53:24 +01:00
Jordan Justen	33d0016836	i965: Add uniform for a CS thread local base ID v4: * Force thread_local_id_index to -1 for now, and have fs_visitor::setup_cs_payload look at thread_local_id_index. This enables us to more easily cut over from the old local ID layout to the new layout, as suggested by Jason. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `fa279dfbf0`)	2016-06-02 13:51:11 +01:00
Jordan Justen	169b700dfd	i965: Add nir channel_num system value v2: * simd16/32 fixes (curro) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `8f48d23e0f`)	2016-06-02 13:48:20 +01:00
Jordan Justen	33e985f8b9	nir: Make lowering gl_LocalInvocationIndex optional Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `6f316c9d86`)	2016-06-02 13:45:29 +01:00
Jordan Justen	c9de6190a0	glsl: Add glsl LowerCsDerivedVariables option v2: * Move lower flag to context constants. (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `7b9def3583`)	2016-06-02 13:38:06 +01:00
Jason Ekstrand	05d88165d9	i965/fs: Copy the offset when lowering logical pull constant sends This fixes 64 Vulkan CTS tests per gen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `1205999c22`)	2016-06-02 13:37:30 +01:00
Dave Airlie	d1cf18497a	glsl/distance: make sure we use clip dist varying slot for lowered var. When lowering, we always want to use the clip dist varying. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `8d4f4adfbd`)	2016-06-02 13:36:45 +01:00
Kenneth Graunke	5a44d36b46	i965: Fix isoline reads in scalar TES. Isolines aren't reversed. commit `5b2d8c2273` fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `25e1b8d366`)	2016-06-02 13:36:14 +01:00
Ian Romanick	0e54eebeed	glsl: Use Geom.VerticesOut == -1 to specify unset Because apparently layout(max_vertices=0) is a thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a428c955ce`)	2016-06-02 13:35:18 +01:00
Ian Romanick	0ab1a3957a	i965: If control_data_header_size_bits is zero, don't do EndPrimitive This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `b27dfa5403`)	2016-06-02 13:34:43 +01:00
Ian Romanick	1398a9510f	mesa: Fix bogus strncmp The string "[0]\0" is the same as "[0]" as far as the C string datatype is concerned. That string has length 3. strncmp(s, length_3_string, 4) is the same as strcmp(s, length_3_string), so make it be strcmp. v2: Not the same as strncmp(..., 3). Noticed by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `049bb94d2e`)	2016-06-02 13:33:53 +01:00
Ilia Mirkin	b265796c79	nir: allow sat on all float destination types With the introduction of fp64 and fp16 to nir, there are now a bunch of float types running around. A F1 2015 shader ends up with an i2f.sat operation, which has a nir_type_float32 destination. Allow sat on all the float destination types. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `ca135a2612`)	2016-06-02 13:32:52 +01:00
Alex Deucher	4a00da1662	radeonsi: fix the raster config setup for 1 RB iceland chips I didn't realize there were 1 and 2 RB variants when this code was originally added. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `bd85e4a041`)	2016-06-02 13:32:05 +01:00
Dave Airlie	e817522728	mesa/sampler: fix error codes for sampler parameters. The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it, however version 8 of it fixed this, and the GL specs also have the fixed value in them. Fixes: GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `6400144041`)	2016-06-02 13:31:18 +01:00
Dave Airlie	915cc490d7	glsl: define some GLES3 constants in GLSL 4.1 The GLSL 4.1 spec adds: gl_MaxVertexUniformVectors gl_MaxFragmentUniformVectors gl_MaxVaryingVectors This fixes: GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `0ebf4257a3`)	2016-06-02 13:30:24 +01:00
Topi Pohjolainen	683c6940d8	i965: Add norbc debug option This INTEL_DEBUG option disables lossless compression (also known as render buffer compression). v2: (Matt) Use likely(!lossless_compression_disabled) instead of !likely(lossless_compression_disabled) (Grazvydas) Update docs/envvars.html Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `6ca118d2f4`)	2016-06-02 13:28:22 +01:00
Topi Pohjolainen	2d483256d5	i965/gen9: Configure rbc buffers as plain for non-rbc tex views Fixes rendering in Shadow of Mordor with rbc. The application writes an RGBA_UNORM texture, filling it with values it later wants to treat as SRGB_ALPHA. The Intel driver enables lossless compression for the buffer at the time of writing. However, the driver fails to make sure the buffer can be sampled as something else later on, and unfortunately there is a restriction in the hardware on using lossless compression for srgb formats, which appears to extend to the sampling engine as well. Requesting srgb to linear conversion on top of a compressed buffer results in color values that are pretty much garbage. Fortunately none of the tracked benchmarks showed a regression with this. v2 (Matt): Add missing space Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `30e9e6bd07`)	2016-06-02 13:27:53 +01:00
Kenneth Graunke	8c627af1f0	i965: Fix the passthrough TCS for isolines. We weren't setting up several of the uniform values for the patch header, so we'd crash when uploading push constants. We at least need to initialize them to zero. We also had the isoline parameters reversed, so it would also render incorrectly (if it didn't crash). Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in GL44-CTS.tessellation_shader.single.max_patch_vertices. (*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `a3dc99f3d4`)	2016-06-02 13:27:23 +01:00
Dave Airlie	86e367a572	i965/xfb: skip components in correct buffer. The driver was adding the skip components but always for buffer 0. This fixes: GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `ebb81cd683`)	2016-06-02 13:26:50 +01:00
Dave Airlie	64015c03bb	glsl/linker: fix multiple streams transform feedback. `e2791b38b4` mesa/program_interface_query: fix transform feedback varyings. caused a regression in GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams on radeonsi. The problem was it was using the skip components varying to set the stream id, when it should wait until a varying was written, this just adds the varying checks in the right place. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `1fe7bbb911`)	2016-06-02 13:25:59 +01:00
Dave Airlie	99fcfd985e	mesa/bufferobj: use mapping range in BufferSubData. According to GL4.5 spec: An INVALID_OPERATION error is generated if any part of the specified buffer range is mapped with MapBufferRange or MapBuffer (see section 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the MapBufferRange access flags. So we should use the "if range is mapped" path. This fixes: GL45-CTS.buffer_storage.map_persistent_buffer_sub_data Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `e891f7cf55`)	2016-06-02 13:25:08 +01:00
Ilia Mirkin	7bc29c784a	nv50/ir: fix error finding free element in bitset in some situations This really only hits for bitsets with a size of a multiple of 32. We can end up with pos = -1 as a result of the ffs, which we in turn decide is a valid position (since we fall through the loop and i == 1, we end up adding 32 to it, so end up returning 31 again). Up until recently this was largely unreachable, as the register file sizes were all 63 or 255. However with the advent of compute shaders which can restrict the number of registers, this can now happen. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `18d11c9989`)	2016-06-02 13:24:08 +01:00
Timothy Arceri	b2b7f05da6	Revert "glsl: fix xfb_offset unsized array validation" This reverts commit `aac90ba292`. The commit caused a regression in: piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom Also the CTS test it was meant to fix seems like it may be bogus. Cc: "12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `98d40b4d11`)	2016-06-02 13:21:36 +01:00
Francisco Jerez	eb56a2f250	i965/fs: Allow scalar source regions on SNB math instructions. I haven't found any evidence that this isn't supported by the hardware, in fact according to the SNB hardware spec: "The supported regioning modes for math instructions are align16, align1 with the following restrictions: - Scalar source is supported. [...] - Source and destination offset must be the same, except the case of scalar source." Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `c1107cec44`)	2016-06-02 13:20:45 +01:00
Francisco Jerez	c1269825cf	i965/fs: Fix constant combining for instructions that cannot accept source mods. This is the case for SNB math instructions so we need to be careful and insert the literal value of the immediate into the table (rather than its absolute value) if the instruction is unable to invert the sign of the constant on the fly. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `06d8765bc0`)	2016-06-02 13:20:15 +01:00
Francisco Jerez	f651a4bb2e	i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `303ec22ed6`)	2016-06-02 13:19:41 +01:00
Francisco Jerez	44029d4237	i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes. Which requires using a bitset instead of a boolean flag to keep track of the GRFs we've seen a generating instruction for already. The search loop continues until all instructions initializing the value of the source VGRF have been found, or it is determined that coalescing is not possible. Fixes a few piglit test cases on Gen4-6 which were regressed by `6956015aa5` due to the different (yet perfectly valid) ordering in which copy instructions are emitted now by the simd lowering pass, which had the side effect of causing this optimization pass to start corrupting the program in cases where a VGRF-to-MRF copy instruction would be eliminated but only the last instruction writing to the source VGRF region would be rewritten to point to the target MRF. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `4fe4f6e8a7`)	2016-06-02 13:19:07 +01:00
Francisco Jerez	910fa7a824	i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation. This will be required to correctly transform the destination of 8-wide instructions that write a single GRF of a VGRF to MRF copy marked COMPR4. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `1898673f58`)	2016-06-02 13:18:33 +01:00
Francisco Jerez	3b78304025	i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops. This will allow compute_to_mrf to handle cases where the source of the VGRF-to-MRF copy is initialized by more than one instruction. In such cases we cannot rewrite the destination of any of the generating instructions until it's known whether the whole VGRF source region can be coalesced into the destination MRF, which will imply continuing the search until all generating instructions have been found or it has been determined that the VGRF and MRF registers cannot be coalesced. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `485fbaff03`)	2016-06-02 13:18:00 +01:00
Francisco Jerez	dd96daa55e	i965/fs: Fix compute-to-mrf VGRF region coverage condition. Compute-to-mrf was checking whether the destination of scan_inst is more than one component (making assumptions about the instruction data type) in order to find out whether the result is being fully copied into the MRF destination, which is rather inaccurate in cases where a single-component instruction is only partially contained in the source region, or when the execution size of the copy and scan_inst instructions differ. Instead check whether the destination region of the instruction is really contained within the bounds of the source region of the copy. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `4b0ec9f475`)	2016-06-02 13:17:26 +01:00
Francisco Jerez	a6011c6fc6	i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap(). Compute-to-mrf was being rather heavy-handed about checking whether instruction source or destination regions interfere with the copy instruction, which could conceivably lead to program miscompilation. Fix it by using regions_overlap() instead of the open-coded and dubiously correct overlap checks. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `bb61e24787`)	2016-06-02 13:16:52 +01:00
Francisco Jerez	2d83aad693	i965/fs: Teach regions_overlap() about COMPR4 MRF regions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `88f380a2dd`)	2016-06-02 13:16:04 +01:00
Dylan Baker	665f57c513	Don't use python 3 Now there are no files that require python 3, so for now just remove the python 3 dependency and use python 2. I think the right plan is to just get all of the python ready for python 3, and then use whatever python is available. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `604010a7ed`)	2016-06-02 13:15:38 +01:00
Dylan Baker	7e62585ee8	genxml: change shebang to python 2 Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `ab31817fed`)	2016-06-02 13:15:08 +01:00
Dylan Baker	4dd70617a1	genxml: use the isalpha method rather than str.isalpha. This fixes gen_pack_header to work on python 2, where name[0] is unicode not str. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `12c1a01c72`)	2016-06-02 13:14:38 +01:00
Dylan Baker	9ed6965749	genxml: require future imports for python2 compatibility. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a45a25418b`)	2016-06-02 13:14:08 +01:00
Dylan Baker	aed6230269	genxml: mark re strings as raw This is a correctness issue. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e5681e4d70`)	2016-06-02 13:13:39 +01:00
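A small illustration of why marking regular expression strings as raw is a correctness issue. The pattern and sample text below are made up for the example and are not taken from gen_pack_header.py: in an ordinary Python string literal `\b` is the backspace character, whereas in a raw string it reaches the `re` module as a word-boundary assertion.

```python
import re

text = "a foo b"

# Raw string: the regex engine receives backslash + 'b', a word boundary.
raw_match = re.search(r"\bfoo\b", text)

# Non-raw string: Python converts "\b" to the backspace character (0x08)
# before the regex engine sees it, so the pattern looks for literal
# backspaces around "foo" and cannot match this text.
cooked_match = re.search("\bfoo\b", text)

print(raw_match is not None)
print(cooked_match is None)
```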
Dylan Baker	f73a68ec37	genxml: Make classes descendants of object This is the default in python3, but in python2 you get old style classes. No one likes old-style classes. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `de2e9da2e9`)	2016-06-02 13:13:09 +01:00
Dylan Baker	0c12887764	genxml: mark gen_pack_header.py as encoded in utf-8 There is unicode in this file, and I'm actually surprised that the python interpreter hasn't gotten grumpy. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `9f50e3572c`)	2016-06-02 13:12:35 +01:00
Marek Olšák	145705e49c	mesa: fix crash in driver_RenderTexture_is_safe This just fixed the crash with the apitrace in bug report. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `8a10192b4b`)	2016-06-02 13:11:43 +01:00
Dave Airlie	d3c92267e0	glsl/images: bounds check image unit assignment The CTS test: GL45-CTS.multi_bind.dispatch_bind_image_textures binds 192 image uniforms, we reject this later, but not until after we trash the contents of the struct gl_shader. Error now reads: Too many compute shader image uniforms (192 > 16) instead of Too many compute shader image uniforms (2745344416 > 16) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `f87352d769`)	2016-06-02 13:10:50 +01:00
Ilia Mirkin	36e26f2ee2	nvc0/ir: fix spilling predicates to registers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4b1a167a2b`)	2016-06-02 12:54:52 +01:00
Emil Velikov	9a56e7d25b	Update version to 12.0.0-rc1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 19:20:34 +01:00
Emil Velikov	7ad2cb6f08	docs: rename release notes to 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 19:20:34 +01:00
Emil Velikov	a43a368457	nir: add the SConscript.nir to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `922b471777`)	2016-05-30 19:20:33 +01:00
Rhys Kidd	f25fdf21e7	vc4: Fix doxygen warnings Now that vc4 automated code documentation can be generated with doxygen, fix the warnings issued by Doxygen 1.8.11. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Rhys Kidd	db975fa86c	doxygen: Plumb through gallium/ to automated documentation Add Gallium and the Gallium-based drivers to doxygen's automated code documentation infrastructure. Can be individually created with: cd $MESA_TOP_LEVEL/ make -C doxygen/ gallium.tag Benefits from the existing doxygen Makefile runners to clean up afterwards with 'make clean'. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	26f4638684	Revert "osmesa: don't try to bundle osmesa.def SConscript" This reverts commit `c07df0f201`. Now that the SCons build is back we need to include the files in the tarball.	2016-05-30 17:53:45 +01:00
Andreas Fänger	9601815b4b	scons: build osmesa swrast and gallium This patch makes it possible to build classic osmesa/swrast on windows again. It was removed in commit `69db422218`. Although there is a gallium version of osmesa now, the swrast version still has more features lacking in llvmpipe, e.g. anisotropic filtering. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> [Emil Velikov: remove trailing whitespace] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	3689ef32af	automake: rework the git_sha1.h rule, include in tarball As we'll need the file in the release tarball, rework the rule so that the file is regenerated _only_ if we're in a git repository. With this in place we can build vulkan (anv) from a release tarball. Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	4cd9cd6abc	automake: move the git_sha1.h rule a level up This way we can reuse the header from other places like - src/intel/vulkan and src/gallium. Only the former is hooked up atm. Make sure .gitignore is updated, as well as all the users (the mesa code does not need any changes). Also ensure that the file is always created by adding it to the BUILT_SOURCES target. Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	13faddb6b8	mesa_glinterop: remove mesa_glinterop typedefs As is there are two places that do the typedefs - dri_interface.h and this header. As we cannot include the former in here, just drop the typedefs and use the struct directly (as needed). This is required because typedef redefinition is a C11 feature which is not supported on all the versions of GCC used to build mesa. v2: Kill the typedef altogether, as per Marek. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96236 Cc: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	d43c894471	glx/glvnd: automake: include all the sources in libglx_la_SOURCES Otherwise the headers will be missing from the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	f9db61d095	glx/glvnd: remove the final if defined($extension) guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	3bf00b6c6a	glx/glvnd: rework dispatch functions/indices tables lookup Rather than checking if the function name maps to a valid entry in the respective table, just create a dummy entry at the end of each table. This allows us to remove some unnecessary "index >= 0" checks, which get executed quite often. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
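The dummy-entry approach from `3bf00b6c6a` might look roughly like this. The table contents, the helper names and the use of `None` as the dummy value are all hypothetical; the real tables live in the GLX/glvnd dispatch code and are not reproduced here.

```python
# Hypothetical name table (sorted) and a parallel dispatch table.
NAMES = ["glXBindTexImageEXT", "glXCreateContextAttribsARB", "glXSwapIntervalSGI"]

# The dispatch table has one extra slot at the end: the dummy entry that
# every unknown name maps to.
DISPATCH = ["bind_tex_image", "create_context_attribs", "swap_interval", None]
DUMMY_INDEX = len(NAMES)


def find_index(name):
    """Return the table index for name, or the dummy index if it is unknown."""
    try:
        return NAMES.index(name)
    except ValueError:
        return DUMMY_INDEX


def lookup(name):
    # No "index >= 0" test is needed: every index find_index() returns is
    # a valid position in DISPATCH.
    return DISPATCH[find_index(name)]


print(lookup("glXSwapIntervalSGI"))
print(lookup("glXNoSuchFunction"))
```

The trade-off is one extra table slot in exchange for removing a branch from a frequently executed path.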
Emil Velikov	eab7e54981	glx/glvnd: Use strcmp() based binary search in FindGLXFunction() It will allow us to find the function within 6 attempts, out of the ~80 entry long table. v2: calculate middle on each iteration, correctly set the lower limit. Reviewed-by: Adam Jackson <ajax@redhat.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
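A minimal sketch of a string-comparison binary search over a sorted name table, in the spirit of `eab7e54981`. The function below and the generated table are assumptions for illustration, not the FindGLXFunction() code; Python's string ordering stands in for strcmp(). As in the v2 note, the middle is recomputed on every iteration and the lower limit moves past the probed entry.

```python
def find_function(names, wanted):
    """Binary-search the sorted list names.

    Returns (index, probes); index is -1 when wanted is absent.
    """
    lo, hi = 0, len(names) - 1
    probes = 0
    while lo <= hi:
        mid = lo + (hi - lo) // 2   # recomputed on each iteration
        probes += 1
        if names[mid] == wanted:
            return mid, probes
        if names[mid] < wanted:
            lo = mid + 1            # lower limit moves past the probed entry
        else:
            hi = mid - 1
    return -1, probes


# Hypothetical 80-entry table with zero-padded names so it is sorted.
table = ["glXFunc%02d" % i for i in range(80)]

index, probes = find_function(table, "glXFunc42")
missing, _ = find_function(table, "glXNoSuchFunction")
print(index, missing)
```

The number of string comparisons grows with the logarithm of the table size rather than linearly, which is the point of the change.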
Chuck Atkins	f9a35bf012	configure.ac: correct the xlib/xlib-gallium GLX detection for GLVND Things have changed since commit `a92910a` ("glx: Refactor the configure options for glx implementation choice (v3)") where only a single configure option is used to control the GLX provider. [Emil Velikov: Ensure that the check is moved after the detection code.] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:34 +01:00
Kyle Brenneman	22a9e00aab	glx: Implement the libglvnd interface. With reference to the libglvnd branch: https://cgit.freedesktop.org/mesa/mesa/log/?h=libglvnd This is a squashed commit containing all of Kyle's commits, all but two of Emil's commits (to follow), and a small fixup from myself to mark the rest of the glX* functions as _GLX_PUBLIC so they are not exported when building for libglvnd. I (ajax) squashed them together both for ease of review, and because most of the changes are un-useful intermediate states representing the evolution of glvnd's internal API. Co-author: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-05-30 16:29:49 +01:00
Frederic Devernay	cee459d84d	gallivm: initialize init_native_targets_once_flag correctly Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 16:13:52 +02:00
Ilia Mirkin	8cc80e396e	nvc0/ir: fix emission of predicate spill to register The lane mask only applies to real mov's, while here we're using PSET. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 10:07:01 -04:00
Ilia Mirkin	9444d71611	nvc0: fix some compute texture validation bits on kepler (a) Make sure to update the TIC in case of an updated buffer address (b) Mark newly-inactive textures dirty so that we update the handle in set_tex_handles. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 10:07:01 -04:00
Dave Airlie	bac39dddcf	mesa/xfb: report calculated size for XFB buffer objects. This fixes: GL45-CTS.direct_state_access.xfb_buffers This test looks correct to me, we should work out the size value and report it rather than using only the size from the Range interface. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 21:18:54 +10:00
Emil Velikov	e7bd5b4b77	swr: automake: silence the python invocation Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:08 +01:00
Emil Velikov	04987ef229	swr: automake: attempt to fix the out-of-tree build Make sure that the output folder is created otherwise the python script yells at us. Cc: 0xe2.0x9a.0x9b@gmail.com Cc: Tim Rowley <timothy.o.rowley@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96238 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	3a59a624d0	swr: remove LLVM dependency from source generation rules. The dependencies should not mention any files external to the project. If we want to do sanity checks for the LLVM installed on the system we should do that in configure, yet again where is the merit which header gets checked and which doesn't ? Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	b05b782b43	swr: add all the generators to the release tarball. Namely the python scripts and the knobs.template. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	38394b5d76	anv: automake: don't forget to cleanup dev_icd.json Otherwise `make distcheck' will barf at us as the file is dangling. Ideally this should be part of the clean-local hook, although we include install-lib-links.mk which already has one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:29:21 +01:00
Emil Velikov	220d8c99fa	anv: automake: bring back VULKAN_ENTRYPOINT_CPPFLAGS We should not have removed them in the first place. There's a subtle difference between generating the complete sources and using them which was not obvious as we nuked them. Without this, the release tarball ends up without various hunks of the generated sources, thus things fail at a later stage as we attempt to build them. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:56 +01:00
Emil Velikov	82514f26d8	anv: automake: ship the json files in the release tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:53 +01:00
Emil Velikov	f80b10df8d	softpipe: add sp_buffer.h to the sources list (release tarball) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:28:53 +01:00
Emil Velikov	2f43908395	freedreno: make sure we pick up ir3_nir_trig.py in the release tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:28:53 +01:00
Emil Velikov	36859022ea	isl: add isl_priv.h to the sources list Otherwise it will be missing from the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:50 +01:00
Mauro Rossi	41d252e418	isl: move the sources lists to Makefile.sources [Emil Velikov: use the file in the autoconf build] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:48 +01:00
Emil Velikov	b4f6c70397	isl: automake: list builddir before srcdir in the includes list As seen elsewhere - we want to include the freshly built sources as opposed to the (likely) stale ones in the srcdir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:46 +01:00
Emil Velikov	53a2167e68	isl: automake: flatten the tests rules Fold the unneeded extra variable tests_ldadd, the explicit sources section (single file with the default extension) and flip the check_PROGRAMS <> TESTS order (TESTS includes scripts, while check_PROGRAMS is binaries only). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:43 +01:00
Emil Velikov	1eecc09584	isl: automake: remove unneeded install-lib-links.mk include One uses the makefile to create compatibility symlinks (to $top_builddir/libs) for shared libraries/modules. As we don't create any here, there's no need to include the file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:40 +01:00
Emil Velikov	afc1db739a	isl: automake: remove unneeded SUBDIRS As we do not include any other subdirs but self, we don't need to set it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:37 +01:00
Mauro Rossi	779653489e	genxml: move the sources (headers) list to Makefile.sources [Emil Velikov: use the file in the autoconf build] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-30 10:26:36 +01:00
Emil Velikov	ace5403453	anv: bail out if anv_wsi_init() fails Otherwise we'll end up setting up a device with no winsys integration. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> --- Hard-coding the rendernode name in anv_physical_device_init() is a bad idea really. We could/should be using drmGetDevices() to get info on all the devices (master/render/etc. node names, pci location etc.) and apply our heuristics on top of that. That can come up as a follow up change.	2016-05-30 10:26:36 +01:00
Emil Velikov	93e65fdcac	anv: resolve wayland-only build Ensure that the final X11/XCB hunk is guarded by the correct macro. Otherwise we'll require the symbol even when building without said platform. Cc: Cedric Sodhi <manday@openmail.cc> Reported-by: Cedric Sodhi <manday@openmail.cc> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:26:35 +01:00
Robert Foss	5068d307f9	anv: Fix use of uninitialized variable. The return variable was not set for failure paths. It has now been changed to VK_ERROR_INITIALIZATION_FAILED for failure paths. Coverity: 1358944 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: rebase against master, s/vulkan/anv/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	e382bc649b	gallium: push offset down to driver Push offset down to drivers when importing dmabuf. This is needed to more fully support EGL_EXT_image_dma_buf_import when a non-zero offset is specified. Testing has been done for freedreno, and the following gallium drivers have been compile tested: nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	30d28d7c31	st/dri: cleanup image_from_fd/dma_buf paths Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	9d852a1f75	st/dri: add handling of R8 and GR88 DRI fourcc formats This helps to import dmabuf buffers from DRM_FORMAT_R8 and DRM_FORMAT_GR88 used for example by GStreamer for YUV to RGB conversion using shaders. Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Bas Nieuwenhuizen	e9d3246a7a	radeonsi: Don't offset OFFCHIP_BUFFERING on pre-VI cards. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96239 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 09:59:50 +02:00
Francisco Jerez	d8cf982f7d	i965: Expose GL 4.3 on Gen8+. ARB_compute_shader was the last feature missing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	4decc426c2	i965/fs: Skip gen4 pre/post-send dependency workarounds for the first/last block. We know that there cannot be any destination dependency race if we reach the beginning or end of the program without having found any other instruction the send could possibly race with. This avoids emitting a pile of useless moves at the beginning or end of the program in the most common case in which the program has a single basic block only. On the original i965 I get the following shader-db results: total instructions in shared programs: 3354165 -> 3215637 (-4.13%) instructions in affected programs: 3183065 -> 3044537 (-4.35%) helped: 13498 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	daf4a71883	i965/fs: Skip SIMD lowering source unzipping for regular scalar regions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	6956015aa5	i965/fs: Factor out region zipping and unzipping from the SIMD lowering pass. Just to make sure we keep the SIMD lowering pass tidy when we introduce additional logic to try to optimize out the copy instructions used to zip and unzip the destination and source regions into multiple packed regions of the lowered instruction width. Shouldn't cause any functional changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a9f00a9e53	i965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files. This will be useful in several places. The only externally visible difference (other than non-VGRF files being supported now) is that the region sizes are now passed in byte units instead of in GRF units because the loss of precision would have become a problem in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
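The remark in `a9f00a9e53` about byte units versus GRF units can be illustrated with a simple interval test. This is a sketch only, not Mesa's regions_overlap(): the `Region` tuple, the 32-byte register size and the GRF-granular variant are assumptions introduced for the example.

```python
from collections import namedtuple

GRF_BYTES = 32  # assumed register size for this illustration

# Hypothetical region description: register file, register number,
# byte offset within the register, and size in bytes.
Region = namedtuple("Region", "file nr offset size")


def regions_overlap(a, b):
    """Byte-granular overlap test between two register regions."""
    same_register = a.file == b.file and a.nr == b.nr
    return (same_register
            and a.offset < b.offset + b.size
            and b.offset < a.offset + a.size)


def regions_overlap_grf_units(a, b):
    """Same test after rounding both regions out to whole GRFs."""
    def grf_span(r):
        return r.offset // GRF_BYTES, (r.offset + r.size - 1) // GRF_BYTES

    (a_first, a_last), (b_first, b_last) = grf_span(a), grf_span(b)
    same_register = a.file == b.file and a.nr == b.nr
    return same_register and a_first <= b_last and b_first <= a_last


# Two disjoint halves of the same GRF.
low_half = Region("VGRF", 3, 0, 16)
high_half = Region("VGRF", 3, 16, 16)

print(regions_overlap(low_half, high_half))            # byte units
print(regions_overlap_grf_units(low_half, high_half))  # GRF units
```

With byte units the two halves are correctly reported as disjoint, while the GRF-granular test conservatively reports an overlap. That is the kind of precision loss the commit message refers to.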
Francisco Jerez	4db93592de	i965/fs: Refactor offset() into a separate function taking the width as argument. This will be useful in the SIMD lowering pass to avoid having to construct a builder object of the known region width just to pass it as argument to offset(), which doesn't do anything with it other than taking the builder dispatch_width as region width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a5b4f63c15	i965/fs: Implement opt_sampler_eot() in terms of logical sends. This makes the whole LOAD_PAYLOAD munging unnecessary which simplifies the code and will allow the optimization to succeed in more cases independent of whether the LOAD_PAYLOAD instruction can be found or not. The following patch is squashed in: SQUASH: i965/fs: Add basic dataflow check to opt_sampler_eot(). The sampler EOT optimization pass naively assumes that the texturing instruction provides all the data used by the FB write just because they're standing next to each other. The least we should be checking is whether the source and destination regions of the FB write and texturing instructions match. Without this the previous seemingly harmless patch would have caused opt_sampler_eot() to misoptimize a shader from dota-2 causing DCE to eliminate all of its 78 instructions except for the final sampler EOT message (!). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a0d9aed268	i965/fs: Fix UB list sentinel dereference in opt_sampler_eot(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	2a166c13d4	i965/fs: Take opt_redundant_discard_jumps out of the optimization loop. No shader-db regressions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	d5f2f32b11	i965/fs: Run SIMD and logical send lowering after the optimization loop. There are two reasons why this is useful: - It avoids the introduction of an amount of partial writes emitted by the SIMD lowering pass to zip and unzip register regions early during optimization, which can make subsequent optimization less effective. - It substantially reduces the burden on the compiler when a large fraction of the instructions in the program need to be split (e.g. during SIMD32 builds). Individual halves of split instructions will be optimized identically (if they can still be optimized at all), so doing it up front can duplicate the amount of instructions the optimizer has to deal with which causes the compilation time to explode in some cases due to the worse-than-linear runtime behaviour of the back-end. It seems helpful to re-run a few optimization passes in cases where any of the lowering passes was able to make progress. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	e9eb59ba68	i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to has_side_effects(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	48d743c501	i965/fs: Allow constant propagation into logical send sources. Logical sends are eventually lowered into a series of copies so they can take almost anything as source. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Francisco Jerez	f1a607cf68	i965/fs: Let CSE handle logical sampler sends as expressions. This will prevent some shader-db regressions when we start plumbing logical sends through the optimizer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Francisco Jerez	b0c8e5e0c8	i965/fs: Pass a BAD_FILE register to the logical FB write when oMask is unused. This will let the optimizer know that the sample mask value is unused so its definition can be DCE'ed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Timothy Arceri	aac90ba292	glsl: fix xfb_offset unsized array validation This partially fixes CTS test: GL44-CTS.enhanced_layouts.xfb_get_program_resource_api The test now fails at a tes evaluation shader with unsized output arrays. The ARB_enhanced_layouts spec says: "It is a compile-time error to apply xfb_offset to the declaration of an unsized array." So this seems like a bug in the CTS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-30 15:11:47 +10:00
Timothy Arceri	87fb5aa3e7	glsl: don't crash when attempting to assign a value to a builtin define For example GL_ARB_enhanced_layouts = 3; Fixes: GL44-CTS.enhanced_layouts.glsl_contant_immutablity Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-30 12:47:58 +10:00
Dave Airlie	d98d6e6269	egl/dri3: don't crash on no context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94925 Pointed out by Karol Herbst on irc. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 11:30:04 +10:00
Dave Airlie	e2791b38b4	mesa/program_interface_query: fix transform feedback varyings. The spec says gl_NextBuffer and gl_SkipComponents need to be returned to userspace in the program interface queries. We currently throw those away, this requires a complete piglit run to make sure no drivers fall over due to the extra varyings. This fixes: GL45-CTS.program_interface_query.transform-feedback-built-in Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 11:26:50 +10:00
Dave Airlie	6effdce92e	glsl/ast: subroutineTypes can't be returned from functions. These types can't be returned. This fixes: GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types for the return type case. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 11:25:30 +10:00
Timothy Arceri	db2a35193f	glsl: use has_double() helper Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-30 11:01:40 +10:00
Timothy Arceri	8f4ac20b6f	glsl: fix explicit uniform block alignment This stops the offset being bumped again when an explicit alignment has already been applied. Fixes alignment issues in: GL44-CTS.enhanced_layouts.uniform_block_alignment Note the test still fails due to unrelated issues with doubles. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-30 11:01:32 +10:00
Jordan Justen	7398a32c50	i965: Shrink stage_prog_data param array length It appears we were over-allocating these arrays. Previously we would use nir->num_uniforms directly for scalar programs, and multiply it by 4 for vec4 programs. Instead we should have been dividing by 4 in both cases to convert from bytes to a gl_constant_value count. The size of gl_constant_value is 4 bytes. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 09:59:55 -07:00
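The arithmetic behind `7398a32c50` can be worked through with a short example. The function names and the 64-byte input are hypothetical; only the 4-byte size of gl_constant_value and the old and new formulas come from the commit message.

```python
GL_CONSTANT_VALUE_SIZE = 4  # bytes, as stated in the commit message


def param_count_old(num_uniform_bytes, is_scalar):
    """Previous sizing: byte count used directly, or multiplied by 4 for vec4."""
    return num_uniform_bytes if is_scalar else num_uniform_bytes * 4


def param_count_new(num_uniform_bytes):
    """New sizing: convert bytes to a gl_constant_value count."""
    return num_uniform_bytes // GL_CONSTANT_VALUE_SIZE


num_uniform_bytes = 64  # hypothetical program with 16 four-byte uniform values

needed = param_count_new(num_uniform_bytes)
old_scalar = param_count_old(num_uniform_bytes, is_scalar=True)
old_vec4 = param_count_old(num_uniform_bytes, is_scalar=False)

print(needed, old_scalar, old_vec4)
print(old_scalar // needed, old_vec4 // needed)  # over-allocation factors
```

Under these assumptions the old scalar path allocated 4 times the entries needed and the old vec4 path 16 times.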
Ilia Mirkin	160063b110	nv50,nvc0: fix the max_vertices=0 case This is apparently legal. Drop any emit/restarts, and pass a 1 to the hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-29 09:34:03 -04:00
Ilia Mirkin	f2e7268a55	st/mesa: fix setting of point_size_per_vertex in ES contexts GL ES 2.0+ does not have a GL_PROGRAM_POINT_SIZE enable, unlike desktop GL. So we have to go and check the last pre-rasterizer stage to see whether it outputs a point size or not. This fixes a number of dEQP tests that use a geometry or tessellation shader to emit points primitives. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-29 09:34:03 -04:00
Marek Olšák	04a78068ff	mesa: skip level checking for FramebufferTexture*D if texture is zero From the OpenGL 4.5 core spec: "An INVALID_VALUE error is generated if texture is not zero and level is not a supported texture level for textarget, as described above." Other FramebufferTexture functions already do the right thing. This fixes the main menu in F1 2015. Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-29 14:24:23 +02:00
Ilia Mirkin	60341ddd5c	st/mesa: expose OES_shader_io_blocks when we have enough for ES 3.1 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-28 20:58:12 -04:00
Vinson Lee	884ac61722	swr: [rasterizer] Do not define _mm256_storeu2_m128i with icc. Fix build error with icc. CXX libswrAVX_la-swr_clear.lo icpc: command line warning #10006: ignoring unknown option '-Wdelete-non-virtual-dtor' In file included from ./rasterizer/jitter/jit_api.h(31), from swr_context.h(30), from swr_clear.cpp(24): ./rasterizer/common/os.h(135): error: expected an identifier void _mm256_storeu2_m128i(__m128i hi, __m128i lo, __m256i a) ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-28 14:26:54 -07:00
Thomas Hindoe Paaboel Andersen	df210ff24d	i965: add missing return in if statement Re-add the "return false" that was removed in `0c02d7002d` It seems that something went wrong when merging the patch. The patch sent to the mailing list does not directly match what was committed. https://lists.freedesktop.org/archives/mesa-dev/2016-May/118198.html Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-28 11:26:33 -07:00
Ilia Mirkin	c7731a0740	gk110/ir: fix unspilling of predicates from registers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-05-28 13:14:19 -04:00
Samuel Pitoiset	697237b71e	nvc0: remove outdated surfaces validation code for GK104 This code was used for validating surfaces with compute but now we use pipe_image_view instead. Anyway, surfaces support should be re-introduced properly once OpenCL happens. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-28 15:50:07 +02:00
Samuel Pitoiset	f07ade6881	nvc0: do not always invalidate 3D CBs when using compute Constant buffers are aliased between 3D and CP on Fermi, but we should only invalidate them when a compute shader actually uses CBs and not all the time after launching a grid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-28 15:50:03 +02:00
Francisco Jerez	357495b94d	i965: Update compute workgroup size limit calculation for SIMD32. This should have the side effect of enabling the ARB_compute_shader extension on Gen8+ hardware and all Gen7 platforms that didn't previously expose it (VLV and IVB GT1) due to the number of hardware threads per subslice being insufficient in SIMD16 mode. v2: Bump workgroup size limit for GLES too (Jordan). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 23:29:06 -07:00
Francisco Jerez	46ce93ed22	i965: Add do32 debug option. The do32 INTEL_DEBUG option causes the back-end to try to generate a SIMD32 program when compiling a compute shader regardless of the specified compute shader workgroup size, which will be useful for testing SIMD32 code generation in the most common case in which the workgroup size doesn't exceed the SIMD16 limit so SIMD32 codegen wouldn't be automatically enabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	864737ce6c	i965/fs: Build 32-wide compute shader when needed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	37fd13ee2d	i965/fs: Extend back-end interface for limiting the shader dispatch width. This replaces the current fs_visitor::no16() interface with fs_visitor::limit_dispatch_width(), which takes an additional parameter allowing the caller to specify the maximum dispatch width a shader can be compiled with. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	2d288cb9ea	i965/fs: Implement SIMD32 register allocation support. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	7f10d3983b	i965/fs: Remove pre-Gen7 register allocation class micro-optimization. This was trying to save some one-time init on pre-Gen7 hardware under the assumption that one would only ever need 1, 2, 4 and 8-wide registers on those platforms. However nothing guarantees that those will be the only VGRF sizes used after lowering and optimization. In some cases we may end up with a temporary of different size being allocated (e.g. by SIMD lowering to zip or unzip a multi-component register region of a logical send instruction), and there is no guarantee that they will be optimized away before register allocation (especially since the compute_to_mrf coalescing pass is rather... lacking...). Instead just allocate classes for all possible VGRF sizes up to MAX_VGRF_SIZE to avoid a crash in pq_test() when we encounter a variable of any other size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	1d5bf46ad1	i965/fs: Don't mutate multi-component arguments in sampler payload set-up. The Gen5+ sampler message payload construction code steps through the coordinate and derivative components by induction like 'coordinate = offset(coordinate, bld, 1)', the problem is that while doing that it may step one past the end of the coordinate vector causing an assertion failure in offset() if it happens to be a (single component) immediate. Right now coordinates and derivatives are typically passed as actual registers but that will no longer be the case when we start propagating constants into logical messages. Instead express coordinate components in closed form like 'offset(coordinate, bld, i)' -- The end result seems slightly more readable that way and it allows passing the coordinate and derivative registers by const reference instead of by value, so it seems like a clean-up in its own right. v2: Fold a few post-increment operators into the last MOV statement. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	ad8f66ed33	i965/fs: Fix multiple ACP interference during copy propagation. This is more fallout from `cf375a3333`. It's possible for multiple ACP entries to interfere with a given VGRF write, so we need to continue iterating even if an overlapping entry has already been found. Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	c88b52745c	i965/fs: Fix cmod propagation not to propagate non-identity cmod into CMP(N). The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I could reproduce this easily with SIMD32 but I don't see any reason why the problem would be SIMD32-specific, it was most likely working by luck. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	8476233ae2	i965/fs: Estimate number of registers written correctly in opt_register_renaming. The current estimate is incorrect for non-32b types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	437e65f9d9	i965/fs: Add (sub)reg_offset asserts to brw_reg_from_fs_reg. These are completely ignored by the conversion to brw_reg, so they better be zero. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	51dd6a60f5	i965/fs: Reset reg_offset of the original destination to zero in compute_to_mrf(). Prevents an assertion failure in the following commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	b9eab911ba	i965/fs: Skip remove_duplicate_mrf_writes() during SIMD32 runs. The pass is disabled in SIMD16 dispatch mode for the same reason, it cannot handle instructions that write multiple MRF registers at once. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	796238d9e6	i965/fs: Use SIMD8 SSBO GET_BUFFER_SIZE message regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	29e4717251	i965/fs: Don't emit duplicated SSBO GET_BUFFER_SIZE instruction unnecessarily. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	a55452530f	i965/fs: Emit fixed width memory fence opcode regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	ae730049c6	i965/fs: Return 32 bit mask from fs_builder::sample_mask(). This doesn't actually handle the FS case, just add an assertion for the moment so I don't forget to update it later on for SIMD32 fragment shader dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	8b6edee679	i965/fs: Emit fixed-width null register regardless of the dispatch width. brw_null_vec() cannot handle widths over 16 but it doesn't really matter what width we specify for null registers because destination regions have no width field at the hardware level. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	298320280f	i965/fs: Fix half() to handle more exotic register files. horiz_offset() is able to deal with a superset of the register files currently special-cased in half(). Just call horiz_offset() in all cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	8c9601ef7b	i965/fs: Fix horiz_offset() to handle ARF and HW GRF register files. We'll hit these in some cases during SIMD lowering in 32-wide programs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	7d430fc05e	i965/fs: Clean up remaining uses of fs_inst::reads_flag and ::writes_flag. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	ecd7a7255a	i965/fs: Keep track of flag dependencies with byte granularity during scheduling. This prevents false dependencies from being created between instructions that write disjoint 8-bit portions of the flag register and OTOH should make sure that the scheduler considers dependencies between instructions that write or read multiple flag subregisters at once (e.g. 32-wide predication or conditional mods). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	0fec265373	i965/fs: Track flag register liveness with byte granularity. This is required for correctness in presence of multiple 8-wide flag writes (e.g. 8-wide instructions with a conditional mod set) which update a different portion of the same 16-bit flag subregister. Right now we keep track of flag dataflow with 16-bit granularity and consider flag writes to have killed any previous definition of the same subregister even if the write was less than 16 channels wide, which can cause live flag register updates to be dead code-eliminated incorrectly. Additionally this makes sure that we handle 32-wide flag writes and reads which may span multiple flag subregisters so the current approach of just setting/testing a single bit from the live set wouldn't have worked. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	df1aec763e	i965/fs: Define methods to calculate the flag subset read or written by an fs_inst. v2: Codestyle fixes (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	ece41df247	i965/fs: Expose arbitrary channel execution groups to the IR. This generalizes the current fs_inst::force_sechalf flag to allow specifying channel enable groups other than 0 or 8. At some point it will likely make sense to fix the vec4 generator to support arbitrary execution groups and then move the definition of fs_inst::group into backend_instruction (e.g. so we can do FP64 in the VEC4 back-end). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	81bc6de8c0	i965/ir: Make BROADCAST emit an unmasked single-channel move. Alternatively we could have extended the current semantics to 32-wide mode by changing brw_broadcast() to emit multiple indexed MOV instructions in the generator copying the selected value to all destination registers, but it seemed rather silly to waste EU cycles unnecessarily copying the exact same value 32 times in the GRF. The vstride change in the Align16 path is required to avoid assertions in validate_reg() since the change causes the execution size of the MOV and SEL instructions to be equal to the source region width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	41562eb8f3	i965/fs: Allow specifying arbitrary quarter control to FIND_LIVE_CHANNEL. This makes FIND_LIVE_CHANNEL behave like a normal instruction for non-zero quarter control. On Gen8+ we just leave the quarter control field of the emitted FBL instruction set to the default value so the hardware applies the expected shift to the execution mask signals. On Gen7 we apply the offset manually by specifying a non-zero subregister offset in the source region of the FBL instruction. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	a5a0810960	i965/fs: Allow specifying arbitrary execution sizes up to 32 to FIND_LIVE_CHANNEL. Due to a Gen7-specific hardware bug native 32-wide instructions get the lower 16 bits of the execution mask applied incorrectly to both halves of the instruction, so the MOV trick we currently use wouldn't work. Instead emit multiple 16-wide MOV instructions in 32-wide mode in order to cover the whole execution mask. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	1e3c58ffaf	i965/fs: Lower 32-wide scratch writes in the generator. The hardware has messages that can write 32 32bit components at once but the channel enable mask gets messed up. We need to split them into several 16-wide scratch writes for the channel enables to be applied correctly. The SIMD lowering pass cannot be used for this because scratch writes are emitted rather late during register allocation long after SIMD lowering has been done. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:02 -07:00
Francisco Jerez	a7d319c00b	i965/fs: Implement scratch reads and writes of 4 GRFs at a time. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	fe5cdde2f9	i965/eu: Fix Gen7+ DP scratch message size calculation on Gen7. Gen7 hardware expects the block size field in the message descriptor to be the number of registers minus one instead of the log2 of the number of registers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	fc7107de1d	i965/eu: Set execution size explicitly for memory fence send message. We don't want to emit a 32-wide send message in 32-wide programs. The memory fence message should have the same effect regardless of the execution size (as long as it's valid) so just set it to one. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	5c887326c5	i965/eu: Consider QtrCtrl 3Q-4Q in typed surface message descriptor setup. In SIMD32 programs the compiler is responsible for providing the appropriate half of the sample mask in the message header, so the first and third quarters both map to the first slot group of the provided 16-bit half, while the second and fourth quarters map to the second slot group -- IOW they should be equivalent to 1Q and 2Q modulo two. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	448340d31f	i965/fs: Clean up remaining uses of dispatch_width in the generator. Most of these are bugs because the intended execution size of an instruction and the dispatch width of the shader aren't necessarily the same (especially in SIMD32 programs). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	7f28ad8c4d	i965/eu: Remove brw_codegen::compressed and ::compressed_stack. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	646213168e	i965/eu: Use current exec size instead of p->compressed in surface message generation. This was kind of an abuse of p->compressed, dataport send message instructions are always uncompressed. Use the current execution size instead since p->compressed is on its way out. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:46 -07:00
Francisco Jerez	492286e90b	i965/fs: No need to reset predicate control after emitting some instructions. Trivial clean-up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	8ef5637729	i965/fs: Pass current execution size to brw_IF() and brw_DO(). This gets IF and DO instructions working in SIMD32 programs. brw_IF() and brw_DO() should probably behave in the same way as other generator functions that emit control flow instructions and just figure out the right execution size by themselves from the current execution controls specified through the brw_codegen argument. Changing that will require updating lots of Gen4-5 clipper code though, so for the moment just pass the current value redundantly from the FS generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	fdae8b9f91	i965/eu: Stop using p->compressed to specify the exec size of control flow instructions. p->compressed won't work for SIMD32, we should just be using the execution size value specified via p->current instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	0b4cd91071	i965/fs: Extend region width calculation to allow arbitrary execution sizes. Instead of just halving the execution size when the instruction is compressed hoping that it will give a legal source region width, we can calculate the maximum legal width value in closed form from the component size and stride. This makes sure that brw_reg_from_fs_reg() always returns a valid hardware region even for virtual 32-wide instructions (e.g. send-like instructions) that would seem to exceed the hardware region width limit after halving. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Kenneth Graunke	dabaf4fb96	i965/fs: Pass the compression mode to brw_reg_from_fs_reg(). Curro is planning to eliminate p->compressed, so let's avoid using it here and just pass in the value directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [ Francisco Jerez: Pass boolean flag instead of brw_compression enum. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	3340a66fce	i965/fs: Simplify per-instruction compression control setup in generator. By using the new compression/group control interface. This will allow easier extension to support arbitrary channel enable groups at the IR level. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	c78edcea8b	i965/fs: No need to set compression control at the top of generate_code(). The right value is dependent on the specific IR instruction being generated so it has to be reset in every iteration of the loop anyway. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	c19c3d3a52	i965/eu: Fix a bunch of compression control bugs in the generator. Most of these were resetting quarter control to zero incorrectly even though everything they needed to do was disable instruction compression -- The brw_SAMPLE() case was doing the right thing but it can be simplified slightly by using the new compression control interface. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	3dffd81583	i965/eu: Define alternative interface for setting compression and group controls. This implements some simple helper functions that can be used to specify the group of channel enable signals and compression enable that apply to a brw_inst instruction. It's intended to replace brw_set_default_compression_control eventually because the current interface has a number of shortcomings inherited from the Gen-4-5-centric representation of compression and group controls as a single non-orthogonal enum: On the one hand it doesn't work for specifying arbitrary group controls other than 1Q and 2Q, which are frequently useful in SIMD32 and FP64 programs. On the other hand the current interface forces you to update the compression and group controls simultaneously, which has been the source of a number of generator bugs (a bunch of them fixed in this series), because in many cases we would end up resetting the group controls to zero inadvertently even though all we wanted to do was disable instruction compression -- The latter seems especially unfortunate on Gen6+ hardware, which has no explicit compression control, so we would end up bashing the quarter control field of the instruction for no benefit. Instead of a single function that updates both at the same time, introduce separate interfaces to update one or the other independently, preserving the current value of the other (which typically comes from the back-end IR so it has to be respected). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	5db4d62395	i965/fs: Remove FS_OPCODE_PACK_STENCIL_REF virtual instruction. It's just a byte MOV with strided source. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	29ce110be6	i965/fs: Remove extract virtual opcodes. These can be easily represented in the IR as a MOV instruction with strided source so they seem rather redundant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:09 -07:00
Francisco Jerez	9dcb8ff6a1	i965: Define brw_int_type() helper. Intended as a (partial) inverse of type_sz(). Will be useful in the next commit and some other SIMD32 generator changes I have queued up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:09 -07:00
Francisco Jerez	bb89beb26b	i965/fs: Remove manual splitting of DDY ops in the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:02 -07:00
Francisco Jerez	982c48dc34	i965/fs: Remove manual unrolling of BFI instructions from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	95272f5c7e	i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	f14b9ea6e6	i965/fs: Drop lowering code for a few three-source instructions from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	117a9a0a64	i965/fs: Set default access mode to Align1 for all instructions in the generator. Currently the generator code for most opcodes honours the default access mode (which should typically be Align1 in the scalar back-end), but generate_code() doesn't set it explicitly which means that the access mode from a previous instruction could leak into the following ones if you did something special and weren't careful enough to save and restore the previous access mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	3a541d0c0b	i965/fs: Remove handcrafted math SIMD lowering from the generator. Most of this wouldn't have worked for SIMD32 and had various dispatch_width and compression control bugs. It's mostly dead now with SIMD lowering of math instructions turned on in the compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	cf5443f984	i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value. Which is 16 or 8 in most cases. This will make sure that 32-wide virtual instructions get chopped up into chunks of their maximum execution size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	197833caa3	i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width. Only per-channel LOAD_PAYLOAD instructions can be lowered, which should cover everything that comes in from the front-end. LOAD_PAYLOAD instructions used to construct actual message payloads cannot be easily lowered because they contain headers and vectors of variable type that aren't necessarily channel-aligned -- We shouldn't find any of them in the program at SIMD lowering time though because they're introduced during logical send lowering. An alternative that may be worth considering would be to re-run the SIMD lowering pass after LOAD_PAYLOAD lowering instead of this patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	9eea3df29f	i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time ...on hardware lacking compressed Align16 support. Will allow simplifying the generator code and fixing it for SIMD32 codegen. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	12ae87abb1	i965/fs: Apply usual FPU-like execution size restrictions to MULH. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	dea9c1df89	i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	122e031548	i965/fs: Assert that IF instruction with embedded compare has legal exec_size. We shouldn't encounter these right now but if we did it wouldn't be possible for the SIMD lowering pass to split it into multiple instructions because of its side effects on control flow, so just assert in order to kill the program. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	98c8bef01c	i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	88d9cc1563	i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	a6bf5f88c7	i965/fs: Enforce common regioning restrictions by SIMD splitting. This change addresses a number of hardware restrictions on the source and destination regions and other execution controls of regular FPU-like instructions that in some cases can be avoided by reducing the execution size of the instruction. Some of these restrictions (e.g. the one about 3src instructions not supporting compression on some hardware) are currently being worked around case by case in the generator with ad-hoc splitting code that is buggy in several ways (e.g. doesn't handle non-trivial execution controls which would break SIMD32 code), but it seems cleaner to implement as many restrictions as we can in a single lowering pass since that will allow us to simplify some of the surrounding code considerably and also make sure that we don't forget applying them in the future. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	2b5adb942b	i965/fs: Enforce extended math exec size limits during SIMD lowering. This teaches the SIMD lowering pass about the hardware limits on the execution size of math instructions, which will allow simplifying the generator code and at the same time get rid of a number of bugs in the manual SIMD unrolling done currently that prevent SIMD32 codegen from working. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	a8e7b4f1d9	i965/fs: Handle SAMPLEINFO consistently like other texturing instructions. Seems like this texturing opcode was missing its logical counterpart which would prevent it from taking advantage of the SIMD lowering infrastructure, define it and plumb it through the back-end. At some point we'll likely want to emit a single SAMPLEINFO message shared among all channels irrespective of this change, but for the moment this should be enough to get the intrinsic working in SIMD32 mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	99b5476d33	i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends. The benefit is we will be able to use the SIMD lowering pass to unroll math instructions of unsupported width and then remove some cruft from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	e531b7907a	i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes. This was causing the scheduler to be rather optimistic about the latency of pull constant opcodes on Gen7+. This might seem to increase the cycle count estimate calculated by the scheduler itself for some shaders, even though the actual cycle count should decrease. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	ed4d0e41ac	i965/fs: Rename Gen4 physical varying pull constant load opcode. For consistency with the Gen7 variant. I'm not doing the same to the uniform pull constant message at this point because the non-GEN7 one is still overloaded to be either an expression-like logical instruction or a Gen4-specific physical send message. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	64a6cb87f1	i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering. Varying pull constant loads inherit the same limitation of pre-ILK hardware that requires expanding SIMD8 texel fetch instructions to SIMD16, we can deal with pull constant loads in the same way it's done for texturing during SIMD lowering. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	d8a3294ac2	i965/fs: Hide varying pull constant load message setup behind logical opcode. This will allow the SIMD lowering pass to split 32-wide varying pull constant loads (not natively supported by the hardware) into 16-wide instructions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	0bc5ad8d19	i965/fs: Avoid constant propagation when the type sizes don't match. The case where the source type of the instruction is smaller than the immediate type could be handled by calculating the portion of the immediate read by the instruction (assuming that the source channels are aligned with the destination channels of the copy) and then representing the same value as an immediate of the source type (assuming such an immediate type exists), but the code below doesn't do that, so just bail for the moment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	52cc80d859	i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases. If the LOAD_PAYLOAD instruction only has header sources it's possible for the number of registers written to be less than or equal to the SIMD component size, in which case it would take the single-MOV path at the bottom which would cause the channel enable masks to be applied incorrectly to the header contents and/or cause it to write past the end of the allocated temporary. If the instruction is either LOAD_PAYLOAD or doesn't write exactly one component the MOV path is going to mess up the program so just don't use it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	c5f224145a	i965/fs: Handle instruction predication in SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	1760c24b4b	i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering. If the source value is going to be the same for all SIMD-lowered chunks of the instruction there should be no need to unzip the value into multiple temporary registers, one for each lowered chunk. As a side effect this fixes SIMD lowering of instructions with a vector immediate source. In the long term it might still be worth fixing offset() to handle vector immediates correctly though, this should be good enough for the moment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	168163f5f0	i965/fs: Generalize is_uniform() to is_periodic(). This will be useful in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	b736e78ddb	i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	2db9dd5aeb	i965/fs: Fix off-by-one region overlap comparison in copy propagation. This was introduced in `cf375a3333` but the blame is mine because the pseudocode I sent in my review comment for the original patch suggesting to do things this way already had the off-by-one error. This may have caused copy propagation to be unnecessarily strict while checking whether VGRF writes interfere with any ACP entries and possibly miss valid optimization opportunities in cases where multiple copy instructions write sequential locations of the same VGRF. Cc: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-27 23:19:20 -07:00
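The off-by-one described in `2db9dd5aeb` can be illustrated with a generic interval check. This is a minimal sketch, not the Mesa code: the function names and the half-open `[start, start + size)` convention are assumptions chosen for illustration. With `<=` instead of `<`, two regions that merely touch are reported as overlapping, which matches the "unnecessarily strict" behaviour for copies that write sequential locations of the same VGRF.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration; these are not Mesa functions.
 * A region is the half-open range [start, start + size) of register units. */
static bool
regions_overlap(unsigned a, unsigned a_size, unsigned b, unsigned b_size)
{
   assert(a_size > 0 && b_size > 0);
   /* Strict comparisons: ranges that merely touch do not overlap. */
   return a < b + b_size && b < a + a_size;
}

/* Same check with an off-by-one: `<=` also reports adjacent ranges. */
static bool
regions_overlap_off_by_one(unsigned a, unsigned a_size,
                           unsigned b, unsigned b_size)
{
   return a <= b + b_size && b <= a + a_size;
}
```

For two sequential writes covering `[0, 2)` and `[2, 4)`, the strict version reports no interference, while the off-by-one version would conservatively invalidate the entry.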
Ronie Salgado	8f538d9ae0	anv/cmd_buffer: Don't delete command buffers in ResetCommandPool() v2 (Jason Ekstrand): Destroy command buffers in DestroyCommandPool(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95034 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 18:56:33 -07:00
Brian Paul	747754f027	gallium/util: another s/unsigned/enum pipe_prim_type/ for clang Trivial.	2016-05-27 18:42:21 -06:00
Jason Ekstrand	b93b5935a7	anv: Try the first 8 render nodes instead of just renderD128 This way, if you have other cards installed, the Vulkan driver will still work. No guarantees about WSI working correctly but offscreen should at least work. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95537	2016-05-27 17:18:33 -07:00
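The probing strategy in `b93b5935a7` might look roughly like the sketch below. All function names are hypothetical, and `try_init` stands in for whatever the driver does to decide whether a node is usable; only the `/dev/dri/renderD128` starting point and the count of 8 come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define FIRST_RENDER_NODE 128   /* renderD128 is the first render node */
#define MAX_RENDER_NODES  8     /* "the first 8 render nodes" */

/* Hypothetical helper: format the path of the i-th render node. */
static int
render_node_path(char *buf, size_t len, int i)
{
   assert(i >= 0 && i < MAX_RENDER_NODES);
   return snprintf(buf, len, "/dev/dri/renderD%d", FIRST_RENDER_NODE + i);
}

/* Try each node in order; returns the index of the first node that
 * try_init() accepts, or -1 if none of them is usable. */
static int
find_render_node(bool (*try_init)(const char *path))
{
   char path[32];

   for (int i = 0; i < MAX_RENDER_NODES; i++) {
      render_node_path(path, sizeof(path), i);
      if (try_init(path))
         return i;
   }
   return -1;
}

/* Example probe for demonstration: pretends only renderD130 is supported. */
static bool
example_probe(const char *path)
{
   return strcmp(path, "/dev/dri/renderD130") == 0;
}
```

Calling `find_render_node(example_probe)` skips the two nodes the probe rejects and stops at the third, which is the behaviour that lets the driver work when other cards occupy the lower-numbered nodes.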
Jason Ekstrand	e023c104f7	anv: strdup the device path into the physical device This way we don't have to assume that the string coming in is a piece of constant data that exists forever.	2016-05-27 17:18:33 -07:00
Jason Ekstrand	9048dee328	anv/formats: Exit early for unsupported formats	2016-05-27 17:17:09 -07:00
Jason Ekstrand	10bc9f7024	anv/formats: Map VK_FORMAT_UNDEFINED to ISL_FORMAT_UNSUPPORTED At one point in time, we may have used the mapping to ISL_FORMAT_RAW for certain buffer surfaces but that time has long since passed. This fixes a bug where doing format queries on VK_FORMAT_UNDEFINED would assert-fail.	2016-05-27 17:17:09 -07:00
Jason Ekstrand	b16326c740	anv/clear: Remove an unused variable	2016-05-27 17:17:09 -07:00
Brian Paul	8beb6f3c9c	gallium/util: another unsigned -> enum pipe_prim_type change gcc didn't warn about the unsigned / enum pipe_prim_type mismatch between the .c and .h file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-27 17:55:05 -06:00
Jordan Justen	47e2a57fe9	i965/compute: Fix uniform init issue when SIMD8 is skipped In `d8347f12ea`, we added support for skipping SIMD8 generation when the program local size is too large for SIMD8 to be usable. This change was missed in that commit. This bug would impact gen7 platforms when the compute shader local size is greater than 512, and gen8 platforms when the local size is greater than 448. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-27 16:44:00 -07:00
Bas Nieuwenhuizen	65d4ba6f20	docs: Mention GL4.3 and ES3.1 support for nvc0 and radeonsi v2: also update the introductory text. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-28 01:04:03 +02:00
Jason Ekstrand	fb2a5ceb32	anv: Emit DRAWING_RECTANGLE once at driver initialization Also, we don't actually need it for clipping because meta always colors inside the lines and, for all other operations, the user is required to set a scissor. Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as little as possible. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:11 -07:00
Jason Ekstrand	3a83c176ea	anv/cmd_buffer: Only emit PIPE_CONTROL on-demand This is in contrast to emitting it directly in vkCmdPipelineBarrier. This has a couple of advantages. First, it means that no matter how many vkCmdPipelineBarrier calls the application strings together it gets one or two PIPE_CONTROLs. Second, it allows us to better track when we need to do stalls because we can flag when a flush has happened and we need a stall. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:09 -07:00
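The deferral idea in `3a83c176ea` can be sketched as a pair of functions: one that only records what a barrier needs, and one that emits a command when pending work exists. Every name below is hypothetical and the counter stands in for real command emission; the actual driver may emit one or two PIPE_CONTROLs, whereas this simplified sketch always emits one.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the anv data structure. */
enum {
   PENDING_FLUSH      = 1u << 0,
   PENDING_INVALIDATE = 1u << 1,
};

struct cmd_state {
   uint32_t pending_bits;           /* requested but not yet emitted */
   unsigned pipe_controls_emitted;  /* stand-in for the command stream */
};

/* Called from the barrier entry point: only record what is needed. */
static void
record_barrier(struct cmd_state *s, uint32_t bits)
{
   assert(bits != 0);
   s->pending_bits |= bits;
}

/* Called right before work that depends on earlier barriers. */
static void
apply_pending(struct cmd_state *s)
{
   if (s->pending_bits == 0)
      return;                       /* nothing requested, emit nothing */

   s->pipe_controls_emitted++;      /* stand-in for emitting the command */
   s->pending_bits = 0;
}
```

Any number of `record_barrier()` calls followed by a single `apply_pending()` results in one emission, and a second `apply_pending()` with nothing pending emits nothing.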
Jason Ekstrand	7120c75ec3	genxml: Make PIPE_CONTROL::CommandStreamerStallEnable a boolean This has been declared as a uint since SNB but it's only one bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:07 -07:00
Jason Ekstrand	b26bd6790d	anv/clear: Only clear the render area when doing subpass clears Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:04 -07:00
Jason Ekstrand	5432487792	anv: Move push constant allocation to the command buffer Instead of blasting it out as part of the pipeline, we put it in the command buffer and only blast it out when it's really needed. Since the PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a stall which we would like to avoid. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:17:43 -07:00
Bas Nieuwenhuizen	2cee0d0f9c	radeonsi: enable OpenGL 4.3 Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-27 22:28:11 +02:00
Dave Airlie	0438bc76e2	nouveau: enable GL 4.3 on kepler/fermi Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:52:13 +10:00
Marek Olšák	43550f25ed	radeonsi: always reserve output space for tess factors Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dave Airlie <airlied@redhat.com>	2016-05-27 21:40:43 +02:00
Dave Airlie	c44513a1f3	glsl/linker: call link_uniform blocks on linked shader. The old code called this on the prelinked shader list, but at this point we have the linked shader, so we should call the interface on that alone. This fixes a regression in: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.13 introduced in `5b2675093e` glsl: handle implicit sized arrays in ssbo Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96228 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reported-by: Mark James Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:35:53 +10:00
Dave Airlie	f0254fdd07	mesa/get: drop unused extension checks. These all show up as unused warnings here, so drop them for now. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:29:23 +10:00
Bas Nieuwenhuizen	4717d5a2d3	gallium/ddebug: Add passthrough for query_memory_info. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-27 20:00:07 +02:00
Jason Ekstrand	0482efdc93	nir/inline: Also rewrite param derefs for texture instructions Without this, samplers get left hanging as derefs to variables that don't actually exist. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	2522180845	nir/inline: Break the guts of rewrite_param_derefs into a helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	d19c406395	nir/inline: Make the rewrite_param_derefs helper work on instructions Now that we have the better nir_foreach_block macro, there's no reason to use the archaic block version for everything. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	2fcba404f8	nir/inline: Don't use foreach_instr_safe unless we need to Suggested-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Roland Scheidegger	9247570d42	gallivm: eliminate an unnecessary AND with unorm lerps Instead of doing an add and then masking out the upper bits, we can simply do an add with a half wide type (this, of course, assumes the hw can actually do it...), so we'll get the required zero in the upper bits automatically. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-27 19:11:28 +02:00
Roland Scheidegger	17d685c426	gallium/util: use enum pipe_prim_type instead of unsigned some more There were complaints from a mingw build: u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_prim_type’ [-fpermissive] Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-27 19:11:28 +02:00
Brian Paul	2318d2015a	svga: remove unneeded casts in get_query_result_vgpu9() calls Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-27 10:01:12 -06:00
Brian Paul	9be122e9b0	svga: use MAYBE_UNUSED to silence release-build warnings Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-27 10:00:56 -06:00
Ben Widawsky	8314dd7ff2	isl: Fix some tautological-compare warnings Fixes: isl.c:62:22: warning: self-comparison always evaluates to true [-Wtautological-compare] assert(ISL_DEV_GEN(dev) == dev->info->gen); ^~ isl.c:63:33: warning: self-comparison always evaluates to true [-Wtautological-compare] assert(ISL_DEV_USE_SEPARATE_STENCIL(dev) == dev->use_separate_stencil); Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:59:17 -07:00
Ilia Mirkin	4ccf8c952a	mesa: add support for GLSL ES 3.20 version string Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:25:53 -04:00
Ilia Mirkin	faae9ab2ee	mapi: expose new functions in GL ES 3.2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:25:53 -04:00
Ilia Mirkin	df2881381a	nvc0/ir: handle a load's reg result not being used for locked variants For a load locked, we might not use the first result but the second result is the predicate result of the locking. In that case the load splitting logic doesn't apply (which is designed for splitting 128-bit loads). Instead we take the predicate and move it into the first position (as having a dead result in first def's position upsets all sorts of things including RA). Update the emitters to deal with this as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 21:23:49 -04:00
Ilia Mirkin	04ecad97ff	nvc0/ir: avoid generating illegal instructions for compute constbuf loads For user-supplied constbufs, fileIndex is 0. In that case, when we subtract 1, we'll end up loading from constbuf offset -16. This is illegal, and there are asserts to avoid it. Normally we'd just DCE it, but no point in generating the instructions if they're not going to be used. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 21:23:49 -04:00
Rob Clark	4f98c94be7	gallium/util: fix build break Missing #include caused build breaks after `21a3fb9cd`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-26 20:59:08 -04:00
Jason Ekstrand	9f9f229359	nir/spirv: Allow pointless variable decorations on inputs SPIR-V specifies that a bunch of stuff gets applied to types. This means that a local variable could get, for instance, an array stride. Just because it's pointless doesn't mean you'll never see it.	2016-05-26 17:10:50 -07:00
Brian Paul	1ec45a1948	gallium/util: use enum pipe_prim_type in u_prim.h functions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	7a49b41436	util/indices: move duplicated assignments out of switch cases Spotted by Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	46be65c681	gallium: change pipe_draw_info::mode to be pipe_prim_type Makes debugging with gdb a little nicer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	a25ae485a6	util/indices,svga: s/unsigned/enum pipe_prim_type/ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	21a3fb9cd8	util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	45078e8890	svga: s/unsigned/enum pipe_resource_usage/ for buffer usage variables Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	d21a309c6c	svga: s/unsigned/enum pipe_prim_type/ for primitive type variables Proper enum types were only added recently. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	90afd7b7ef	svga: fix test for unfilled triangles fallback VGPU10 actually supports line-mode triangles. We failed to make use of that before. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	2c07c40d2f	svga: clean up and improve comments in svga_draw_private.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	0f983e1793	util/indices: implement unfilled (tri->line) conversion for adjacency prims Tested with new piglit gl-3.2-adj-prims test. v2: re-order trisadj and tristripadj code, per Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	d6c2c7d710	util/indices: implement provoking vertex conversion for adjacency primitives Tested with new piglit gl-3.2-adj-prims test. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	479d364c39	util/indices: assert that the incoming primitive is a triangle type The unfilled index translator/generator functions should only be called when the primitive mode is one of the triangle types. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	26de558072	util/indices: formatting, whitespace fixes in u_unfilled_indices.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	24eadb4810	util/indices: improve comments in u_indices.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	5393238765	svga: fix primitive mode (point/line/tri) test for unfilled primitives The original mode test was valid before we had GS support. Regression tested with full piglit run. Though, I don't think we have any piglit tests that exercise drawing unfilled adjacency primitives. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Ian Romanick	b7af108d3e	i965: Enable GL_OES_shader_io_blocks Only one dEQP io_blocks test fails. This test fails for the same reason as the match_different_member_struct_names test in a previous commit. dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names v2: Add to release notes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	660240da9e	glsl: Allow shader interface blocks in GLSL ES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	7a3093efcc	glsl: Add a has_shader_io_blocks helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	f0902ee813	mesa: Add extension tracking for GL_OES_shader_io_blocks v2: Also support GL_EXT_shader_io_blocks. It's pretty much identical to the OES extension. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	326a269c77	mesa: Only validate SSO shader IO in OpenGL ES or debug context v2: Move later in series to avoid issues with Gallium drivers and debug contexts. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-26 16:23:53 -07:00
Ian Romanick	3722c76001	mesa: Remove old validate_io function The new validate_io catches all of the cases (and many more) that the old function caught. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:22:25 -07:00
Ian Romanick	bd3f15cffd	mesa: Additional SSO validation using program_interface_query data Fixes the following dEQP tests on SKL: dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_smooth_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_implicit_explicit_location_1 dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_element_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_none dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_order dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_centroid_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_length dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_precision dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_centroid dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_smooth dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_name It regresses one test: dEQP-GLES31.functional.separate_shader.validation.varying.match_different_struct_names However, this test is based on language in the OpenGL ES 3.1 spec that I believe is incorrect. I have already submitted a spec bug: https://www.khronos.org/bugzilla/show_bug.cgi?id=1500 v2: Move spec quote about built-in variables to the first place where it's relevant. Suggested by Alejandro. v3: Move patch earlier in series, fix rebase issues. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v2] Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> [v2]	2016-05-26 16:21:01 -07:00
Ian Romanick	cfff746297	mesa: Track the additional data in gl_shader_variable The interface type, interpolation mode, precision, the type of the outermost structure, and whether or not the variable has an explicit location will be used for SSO validation on OpenGL ES. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:19:16 -07:00
Jason Ekstrand	15e553daf0	nir: Make nir_const_value a union There's no good reason for it to be a struct of an anonymous union. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 16:03:44 -07:00
Kenneth Graunke	e7776fa947	i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field. commit `7c8dfa78b9` (i965/draw: Use the real size for vertex buffers) changed how we programmed the VERTEX_BUFFER_STATE size field. Previously, we programmed it to the size of the actual underlying BO, which is page-aligned, and potentially much larger than the GL buffer object. This violated the ARB_robust_buffer_access spec. With that change, we started programming it based on the range of data we expect the draw call to actually access - which is based on the min_index and max_index information provided to glDrawRangeElements(). Unfortunately, applications often provide inaccurate range information to glDrawRangeElements(). For example, all the Unreal demos appear to draw using a range of [0, 3] when the index buffer's actual index range is [0, 5]. Such results are undefined, and we are absolutely allowed to restrict access to the range they specified. However, the failure mode is usually that nothing draws, or misrendering with wild geometry, which is kind of bad for a common mistake. And people tend to assume the range information isn't that important when data is in VBOs. There's no real advantage, either. ARB_robust_buffer_access only requires us to restrict access to the GL buffer object size, not the range of data we think they should access. Doing that allows buggy applications to still function. (Note that we still use this information for busy-tracking, so if they try to overwrite the data with glBufferSubData, they'll still hit a bug.) This seems to be safer. We may want to provide the more strict range as a debug option, or scan the VBO and warn against bogus glDrawRangeElements in debug contexts. That can be done as a later patch, though. Makes Unreal demos draw again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-26 15:56:41 -07:00
Samuel Pitoiset	e01a482182	nvc0: invalidate textures/samplers between 3D and CP on Fermi Like constant buffers, samplers and textures are aliased on Fermi and we need to invalidate the state when switching from 3D to CP and vice versa. This fixes rendering issues in the UE4 demos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 23:51:22 +02:00
Jason Ekstrand	9f0bc0f2b3	anv: Stop linking against libmesa.la and libdri_test_stubs.la This brings the final size of an optimized non-debug build of the Vulkan driver down to 2.9 MB as opposed to 8.7 MB for the dri driver. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	057259655e	i965: Don't link libmesa or libdri_test_stubs into tests Now that the compiler has been completely separated from libmesa, we no longer need these. We can make the tests much smaller by not linking them in. This also ensures that anyone who runs make check won't accidentally put in any dependencies from the compiler to the rest of mesa core. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	870ff6cd38	i965: Move compiler debug functions to intel_screen.c They reference the compiler so they shouldn't go in libi965_compiler.la. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	327161a48d	i965/test: Remove the fragment/vertex_program field from test visitors None of them are actually using it. It's a relic of an older compiler interface that required a gl_program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	e0ae10c49a	i965: Move brw_new_shader to brw_link.cpp That's where brw_link_shader lives and they seem to go together. Also, this gets it out of libi965_compiler. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5136b67915	i965: Move brw_nir_lower_uniforms.cpp to i965_FILES This gets it out of i965_compiler.la Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5e43ba7e9e	i965: Move brw_create_nir to brw_program.c This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	86a2447eec	i965/nir: Move the type_size_*_bytes functions to brw_nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	58d1e82d32	ptn: Include nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	32210dea8e	compiler: Move glsl_to_nir to libglsl.la Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Ben Widawsky	ddcfc35f62	i965/sklgt4: Implement depth/timestamp write w/a The stated bug describes a scenario in which a post sync write operation for depth or timestamp can be ignored. There are two workarounds suggested, the first and easier is to simply do a cs stall when we do these types of writes. The second option is to do a PIPE_CONTROL flush after the post sync but before the data is required. Generally, I believe the data written out is consumed by the application on the CPU side and so doing the easier of the two is ideal. Furthermore, these queries aren't tremendously common in the perf sensitive apps I have looked at. However, there could be cases where a shader stage might directly consume the data, and as a result option 2 may be desirable. This patch goes with the easier solution for now. gen9lp bug_de_id=2137196 By itself, this does not fix any of the GT4 hangs we're currently experiencing. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 14:08:17 -07:00
Ben Widawsky	f1fa8b4a1c	i965/bxt: Add 2x6 variant Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:06:43 -07:00
Bas Nieuwenhuizen	43d7305a40	radeonsi: Allow TES distribution between shader engines. The R_028B50_VGT_TESS_DISTRIBUTION value is copied from amdgpu-pro. Smaller values in the ACCUM fields seem to decrease the performance advantage from this patch, higher values don't seem to matter. v2: Add distribution mode field enums. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	f91c85b29b	radeonsi: Process multiple patches per threadgroup. Using more than 1 wave per threadgroup does increase performance generally. Not using too many patches per threadgroup also increases performance. Both catalyst and amdgpu-pro seem to use 40 patches as their maximum, but I haven't really seen any performance increase from limiting the number of patches to 40 instead of 64. Note that the trick where we overlap the input and output LDS does not work anymore as the insertion of the tess factors changes the patch stride. v2: - Add comment about LDS assumptions. - Add constant for buffer size. - Fix code style. v3: - Correct limits for not splitting patches between waves. - Set max num_patches to 40 as in the proprietary driver. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fd0a7a382f	radeonsi: Add barrier before writing the tess factors. The factors may be stored to LDS by another invocation than the invocation for vertex 0. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fee3160af9	radeonsi: Enable dynamic HS. This allows running the TES on different CU's than the TCS which results in performance improvements. v2: Only write the control word from one invocation. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	26f436132b	radeonsi: Remove LDS layout user SGPR's from TES. They are unused. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	a4e2146a9d	radeonsi: Use buffer loads and stores for passing data from TCS to TES. We always try to use 4-component loads, as LLVM does not combine loads and they bypass the L1 cache. We can't use a similar strategy for stores and this is especially notable with the tess factors, as they are often set with separate MOV's per component in the TGSI. We keep storing to LDS and the LDS space, so we can load the outputs later, either due to the shader, or for writing the tess factors. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	6217716e8f	radeonsi: Store inputs to memory when not using a TCS. We need to copy the VS outputs to memory. I decided to do this using a shader key, as the value depends on other shaders. I also switch the fixed function TCS over to monolithic, as otherwise many of the user SGPR's need to be passed to the epilog, which increases register pressure, or complexity to avoid that. The main body of the fixed function TCS is not that interesting to precompile anyway, since we do it on demand and it is very small. v2: Use u_bit_scan64. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	7846fa8768	radeonsi: Add offchip buffer address calculation. Instead of creating a memory area per patch and per vertex, we put the same attribute of every vertex & patch together. Most loads and stores access the same attribute across all lanes, only for different patches and vertices. For the TCS this results in tightly packed data for 4-component stores. For the TES this is not the case as within a patch the loads often also access the same vertex. However if there are < 4 vertices/patch, this still results in a reduction of the number of cache lines. In the LDS situation we only do better than worst case if the data per patch < 64 bytes, which due to the tessellation factors is pretty much never. We do not use hardware swizzling for this. It would slightly reduce the number of executed VALU instructions, but I had issues with increased wait times that I haven't been able to solve yet. Furthermore, the tbuffer_store intrinsic does not support both VGPR offset and an index, so we have a problem storing indirectly indexed outputs. This can be solved by temporarily storing arrays in LDS and then copying them, but I don't think that is worth the effort. The difference in VALU cycles hardware swizzling gives is about 0.2% of total busy cycles. That is without handling the array case. I chose attributes instead of components as they are often accessed together, and the software swizzling takes VALU cycles for calculating offsets. v2: - Rename functions to get_tcs_tes_buffer_address. - multiply by 16 as late as possible. - Use tgsi_full_src_register_from_dst. - Remove some bad comments. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
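The attribute-major layout described in `7846fa8768` could be sketched as follows. This is an illustrative guess at the shape of the calculation, not the driver's `get_tcs_tes_buffer_address`: the function name, parameter names, and exact index ordering are assumptions, and per-patch attributes are ignored. The only details taken from the commit message are that the same attribute of every vertex and patch is stored together and that the multiplication by 16 happens as late as possible.

```c
#include <assert.h>
#include <stdint.h>

#define ATTR_SIZE 16u   /* one 4-component, 32-bit attribute */

/* Hypothetical sketch of an attribute-major layout: all copies of
 * attribute 0 come first, then all copies of attribute 1, and so on. */
static uint32_t
offchip_offset(uint32_t attr, uint32_t patch, uint32_t vertex,
               uint32_t num_patches, uint32_t verts_per_patch)
{
   assert(patch < num_patches && vertex < verts_per_patch);

   /* Work in attribute-sized slots first... */
   uint32_t slot = attr * num_patches * verts_per_patch
                 + patch * verts_per_patch
                 + vertex;

   /* ...and scale to bytes as late as possible. */
   return slot * ATTR_SIZE;
}
```

Under these assumptions, lanes that handle neighbouring vertices and read the same attribute touch consecutive 16-byte slots, which is the packing property the commit message relies on for the TCS stores.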
Bas Nieuwenhuizen	c49e68dc4b	radeonsi: Add user SGPR for the layout of the offchip buffer. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d9a0c54f6f	radeonsi: Use correct parameter index for LS_OUT_LAYOUT. This happens to be in the right position, but that changes when TCS/TES get new parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	3e7a7a9a65	radeonsi: Add buffer load functions. v2: - Use llvm.amdgcn.buffer.load intrinsics for new LLVM. - Code style fixes. v3: - Code style fix. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	9fdb778702	radeonsi: Define build_tbuffer_store_dwords earlier to support new users. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	5c34562d7c	radeonsi: Add offchip tessellation parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d27ff7d683	radeonsi: Add buffer for offchip storage between TCS and TES. The buffer is quite large, but should only be allocated if the application uses tessellation. Most non-games don't. v2: - Use the correct register for SI. - Add define for block size. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Rob Clark	6e51fe75a4	tgsi: fix coverity out-of-bounds warning CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local: Overrunning array of 2 16-byte elements at element index 2 (byte offset 32) by dereferencing pointer &inst.Dst[i]. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Rob Clark	3d66ba971e	tgsi: fix out of bounds access Not sure why coverity calls this an out-of-bounds read vs out-of-bounds write. CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local: Overrunning array r of 3 16-byte elements at element index 3 (byte offset 48) using index chan (which evaluates to 3). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Anuj Phogat	0c02d7002d	i965: Don't use fast copy blit in case of logical operations other than GL_COPY XY_FAST_COPY_BLT command doesn't have a field for raster operation. So, fall back to using XY_SRC_COPY_BLT to handle those cases. Fixes piglit test gl-1.1-xor-copypixels when fast copy blit is enabled for all tiling formats. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 10:57:09 -07:00
Anuj Phogat	97f0f91cc1	i965/gen9: Remove the halign/valign field setup code in fast copy blit Experimentation with different values of src/dst horizontal/vertical alignment showed that these fields are not used on gen9 hardware. A recent update in graphics specs has removed these fields from XY_FAST_COPY_BLT command. Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Chad Versace <chad.versace@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-26 10:57:09 -07:00
Samuel Pitoiset	c52e92ec3a	nvc0: allow to monitor MP perf counters with compute shaders To read out MP perf counters we use a compute shader and need to upload input data like a 64-bit addr used to store the values and a sequence ID for synchronization. Currently, this input data is uploaded as user uniforms which means that it's stuck to c0[], but if a compute shader from a real application is used, monitoring those performance counters will just overwrite some data and miserably crash. Instead, sticking the 64-bit addr and the sequence into the driver constant buffer seems much better and will allow to monitor counters with GL 4.3 apps. Tested on GF119 and GK110, but should not hurt anything on GK104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 19:34:57 +02:00
Kristian Høgsberg Kristensen	329d115ac6	mesa: Move robustness code to main/robustness.c Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 09:37:17 -07:00
Kristian Høgsberg Kristensen	d7d729b965	docs: Mark GL_KHR_robustness done for GLES3.2 as well Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 09:36:36 -07:00
Plamena Manolova	a0674ce5c4	egl: Additional attribute validation for eglCreatePbufferSurface eglCreatePbufferSurface should generate an EGL_BAD_MATCH error if: 1: The EGL_TEXTURE_FORMAT attribute is EGL_NO_TEXTURE and EGL_TEXTURE_TARGET is something other than EGL_NO_TEXTURE 2: EGL_TEXTURE_FORMAT is something other than EGL_NO_TEXTURE and EGL_TEXTURE_TARGET is EGL_NO_TEXTURE. This fixes the dEQP-EGL.functional.negative_api.create_pbuffer_surface test. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-26 08:02:48 -07:00
Marek Olšák	8539c9bf31	gallium/radeon: add the kernel version into the renderer string Example: Gallium 0.4 on AMD TONGA (DRM 3.2.0 / 4.5.0, LLVM 3.9.0) My kernel version is pretty long already (4.5.0-amd-01025-g32791c1) and adding "kernel" into the string would make it too long for glxinfo to display. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-26 16:53:46 +02:00
Marek Olšák	53f33619a4	winsys/amdgpu: add back multithreaded command submission Ported from the initial amdgpu winsys from the private AMD branch. The thread creates the buffer list, submits IBs, and cleans up the submission context, which can also destroy buffers. 3-5% reduction in CPU overhead is expected for apps submitting a lot of IBs per frame. This is most visible with DMA IBs. v2: use a semaphore instead of a busy loop in amdgpu_ws_queue_cs add another amdgpu_cs_sync_flush call into amdgpu_bo_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-26 16:43:45 +02:00
Lars Hamre	c626a86586	gallium/tgsi: use _mesa_roundevenf in micro_rnd Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-roundeven-float fs-roundeven-vec2 fs-roundeven-vec3 fs-roundeven-vec4 vs-roundeven-float vs-roundeven-vec2 vs-roundeven-vec3 vs-roundeven-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-roundeven-float gs-roundeven-vec2 gs-roundeven-vec3 gs-roundeven-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 07:59:15 -06:00
Emil Velikov	d519f59a9f	.mailmap: use Jakob Bornecrantz's personal email The VMware one is bouncing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-26 13:57:32 +01:00
Ilia Mirkin	f998e5dc6b	nvc0: add note about where the viewport mask would go Not piping this all the way through yet, but no better place to note this down. This can be used with NV_viewport_array2. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 08:46:29 -04:00
Ilia Mirkin	b634936d3b	nvc0: enable 32 textures on kepler+ For fermi, this likely will require use of linked tsc mode. However on bindless architectures, we can have as many as we want. As it stands, the AUX_TEX_INFO has 32 texture handles reserved. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 08:46:13 -04:00
Alejandro Piñeiro	2ed9563e79	glsl: add unit tests data vertex/expected outcome for uninitialized warning v2: fix 025 test. Add three more tests (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 09:19:36 +02:00
Alejandro Piñeiro	eee00274fa	glsl: add warning-test It executes compiler-glsl on all the available shaders, and it checks that the outcome is the expected. Bash code based on the already existing optimization-test v2: rebasing: use --version option Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 09:19:17 +02:00
Alejandro Piñeiro	68c23d2d04	glsl: add just-log option for the standalone compiler. Add an option in order to ask to just print the InfoLog, without any header or separator. Useful if we want to use the standalone compiler to track only the warning/error messages. v2: all printfs goes on its own line (Ian Romanick) v3: rebasing: move just_log to standalone.h/cpp Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:46:05 +02:00
Alejandro Piñeiro	66ff04322e	glsl: do not raise uninitialized warning with out function parameters It silences warnings for function parameters by default, as the parameters need to be processed in order to have the actual and the formal parameter, and the function signature. Then it raises the warning if needed at verify_parameter_modes where other in/out/inout modes checks are done. v2: fix comment style, multi-line condition style, simplify check, remove extra blank (Ian Romanick) v3: inout function parameters can raise the warning too (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:39:17 +02:00
Alejandro Piñeiro	b9f90ef652	glsl: add an empty set_is_lhs on ast_node Just to allow calling set_is_lhs on any ast_node without a cast. Useful when processing an ast_node list that we know contains ast_expression. v2: comment out new_value to avoid unused parameter warning (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:39:07 +02:00
Dave Airlie	5b2675093e	glsl: handle implicit sized arrays in ssbo The current code disallows unsized arrays except at the end of an SSBO but it is a bit overzealous in doing so. struct a { int b[]; int f[4]; }; is valid as long as b is implicitly sized within the shader, i.e. it is accessed only by integer indices. I've submitted some piglit tests to test for this. This also has no regressions on piglit on my Haswell. This fixes: GL45-CTS.shader_storage_buffer_object.basic-syntax GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO This patch moves a chunk of the linker code down, so that we don't link the uniform blocks until after we've merged all the variables. The logic went something like: Removing the checks for last ssbo member unsized from the compiler and into the linker, meant doing the check in the link_uniform_blocks code. However to do that the array sizing had to happen first, so we knew that the only unsized arrays were in the last block. But array sizing required the variable to be merged, otherwise you'd get two different array sizes in different version of two variables, and one would get lost when merged. So the solution was to move array sizing up, after variable merging, but before uniform block visiting. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:42:10 +10:00
Dave Airlie	4d70fd1bc7	glsl: fix error message on uniform block mismatch This looks like a cut-paste from above. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:40:41 +10:00
Dave Airlie	c952c0e713	glsl/ast: assign explicit_xfb_buffer from correct place This fixes: GL44-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through As the OUT_TC interface structures weren't matching because one of them had explicit_xfb_buffer set when it shouldn't. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:17:03 +10:00
Bruce Cherniak	c8835a5924	swr: [rasterizer] Correctly select optimized primitive assembly. Indexed primitives were always using cut-aware primitive assembly, whether primitive_restart was enabled or not. Correctly pass down primitive_restart and select optimized PA when possible. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-25 18:47:16 -05:00
Kenneth Graunke	978ab88858	docs: Mention i965/gen8+ supports GL 4.2 in release notes.	2016-05-25 14:22:56 -07:00
Kenneth Graunke	72ba9c3160	docs: Update GL_OES_copy_image status.	2016-05-25 14:22:30 -07:00
Kenneth Graunke	0f0f357b77	i965: Enable OES_copy_image (and EXT) on Gen8+ and Baytrail. For now, only enable it on platforms that actually support ETC2. At this point, Broadwell is only failing 5 (out of 8358) dEQP tests: dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits. srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d These fail with all methods (meta, blorp, blitter, memcpy). All are blacklisted from the Android mustpass list, which makes me wonder whether there's an issue with the tests. The formats in question work with other targets, and the targets in question work with other formats... Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	88a630121d	i965: Implement a BLORP path for CopyImage and prefer it over Meta. We're dropping Meta in favor of BLORP everywhere we can. This also fixes bugs when copying cubemaps to 2D, which is currently broken in the meta pass. BLORP just works. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	2822c8a078	i965: Make the CopyImage BLT path bail for stencil images. The BLT can't handle S8 because it's W-tiled (at least without additional funny business, and I'm not sure we care). Disallow it so it falls back to the CPU path, which works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	c51702bdc8	i965: Also copy stencil miptree data. The Meta path handles this, but the CPU/BLT fallbacks did not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	45d6818021	i965: Make a helper function for CopyImage of a miptree. Currently, it only contains the BLT/CPU fallbacks, so the name is a bit too generic. But eventually this will use BLORP as well, at which point the name will make more sense. The next patch will introduce a second call. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	2dc98d9a15	i965: Combine src/dest tex vs. rb checks in intel_copy_image_sub_data. This simplifies things a little - now we only have one (tex or rb?) if-ladder for src, and a second for dst, rather than four. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	1b39c5efca	i965: Account for MinLayer in CopyImageSubData's blitter/CPU paths. Fixes Piglit's arb_copy_image-texview test with the Meta path disabled (so we hit the blitter/CPU fallback paths). v2: Add MinLayer even for cube maps (suggested by Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Rob Clark	231dcb19f9	freedreno/ir3: cmdline compiler for glsl Use glsl/libstandalone.la to add support for taking glsl src files (in addition to .tgsi) as input. Then glsl->nir and feed the result into the ir3 backend as normal. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-25 16:31:15 -04:00
Rob Clark	0f982bb67d	glsl: split out libstandalone Split standalone glsl_compiler into a libstandalone.la and a thin main.cpp. This way drivers can re-use the glsl standalone frontend in their own standalone compilers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-25 16:31:15 -04:00
Rob Clark	ec434d940d	android: drop build of standalone glsl_compiler It's only a tool for debugging the glsl compiler, and should not be installed. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 16:31:15 -04:00
Matt Turner	61847d7708	i965: Mark fallthrough in switch statement. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	83c6749ddb	i965: Assert that a depth_mt exists when using HiZ. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	4a5e92ac70	nir: Strengthen assertion that 'out' is nonnull. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	44809f2371	spirv: Mark default cases unreachable(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	469a1c56a6	isl: Mark default cases unreachable. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	47dca31606	isl: Remove useless qualifier from return type. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Samuel Pitoiset	71c30bd87c	nvc0: add descriptions for hardware perf counters/metrics The GALLIUM_HUD does not yet expose a description for each events, but this might be useful for developers who want to have a long description of hw perf counters directly in the source code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 21:06:49 +02:00
Brian Paul	89e4de20fa	mesa: 80-column wrapping for _context_lost_GetSynciv() Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-25 12:23:12 -06:00
Brian Paul	ae7c4a6f98	mesa: add GLAPIENTRY to new _context_lost_X functions To fix MSVC build. Any function which goes into the dispatch table needs to have the GLAPIENTRY (__stdcall) tag. Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-25 12:23:12 -06:00
Giuseppe Bilotta	1b62b47f6f	scons: support 2.5.0 The get_implicit_deps changed in SCons 2.5, expecting a callable rather than a path as third argument. Detect the SCons versions and set the argument appropriately to support both 2.5 and earlier versions. This closes #95211. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95211 Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Cc: mesa-stable@lists.freedesktop.org Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-25 12:23:12 -06:00
Giuseppe Bilotta	8c00fe3970	scons: whitespace cleanup This text transformation was done automatically via the following shell command: $ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \; Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-25 12:23:12 -06:00
Alejandro Piñeiro	8c29bba242	i965/fs: take into account doubles when emitting system values Fixes the following cts test: GL42-CTS.vertex_attrib_64bit.limits_test Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-25 20:14:22 +02:00
Kristian Høgsberg Kristensen	89bb4be91e	i965: Fix shadowing of 'height' parameter The nested declaration of 'height' shadows a parameter and uses uninitialized memory. Fix by renaming to 'plane_height' which also makes the code clearer. This would typically break the bo size computation, but we don't use that except when mmaping, and we don't mmap YUV buffers much. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-25 09:42:55 -07:00
Kristian Høgsberg Kristensen	595224f714	mesa: Add .gitignore entries for make check binaries Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>	2016-05-25 09:41:44 -07:00
Kristian Høgsberg Kristensen	85008db1d5	i965: Enable GL_KHR_robustness GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry points that we already implement. This patch adds a new dispatch table that returns GL_CONTEXT_LOST from all entry points and implements the GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn that we've lost the context. With the GL_CONTEXT_LOST reporting in place and dispatch for the new entry points we can turn on GL_KHR_robustness. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 09:41:44 -07:00
Emil Velikov	f036eea2cf	.mailmap: Use Chia-I Wu personal e-mail. The LunarG one is bouncing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 17:38:06 +01:00
Emil Velikov	4b79f82836	.mailmap: Use my (Emil Velikov) personal e-mail. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 17:35:48 +01:00
Ilia Mirkin	21c1754306	docs: add missing GL_OES/EXT_gpu_shader5 enablement note Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 09:50:22 -04:00
Ilia Mirkin	601a5195eb	glsl: add GL_EXT_clip_cull_distance define, add helpers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-05-25 09:50:07 -04:00
Brian Paul	9690ab0cdf	tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer Print "GEOM" instead of "2", for example. v2: also update the text parsing code, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Brian Paul	2b773fcf00	tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Jason Ekstrand	998829f404	nir/spirv: Handle location decorations on structure members	2016-05-24 21:12:56 -07:00
Jason Ekstrand	961369d597	nir/spirv: Add explicit handling for all decorations From time to time we have had cases where glslang has added a decoration we don't handle and it has caused problems. This audit ensures that, for every decoration, we either handle it or hit an unreachable() with an accurate description of why we don't have to.	2016-05-24 21:12:56 -07:00
Jason Ekstrand	6f89e51c84	i965/draw: Use the correct buffer index for interleaved VBO sizes The buffer_range_* arrays are indexed by buffer index not element index. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-24 20:50:35 -07:00
Jordan Justen	e58fabc93a	i965/gen7: Fix gl_HelperInvocation It appears that UV immediates aren't working on Ivy Bridge. In this case, a signed version will work, and this fixes the piglit tests/spec/glsl-4.50/execution/helper-invocation.shader_test test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-24 15:44:06 -07:00
Emil Velikov	e384d75b12	mesa_glinterop: make GL interop version field bidirectional This allows clear and easy communication between the two. Caller: Requesting information (struct vN) Callee: I know how to deal with older version (vN-1) only. Here is your data and the version I support. Caller: Older version ? Sure I'll cap all access to the fields provided by the older version (vN-1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	0e983276b9	mesa_glinterop: drop mesa_glinterop_device_info::interop_version One cannot use a single version to control both export_in and export_out versions. Using this forces us to always extend/bump both structs at the same time. An alternative scheme is coming with next patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	f8a114aa5c	st/dri: add note about GL interop version checks ... and make them more explicit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	923bdbf48c	mesa_glinterop: rename MESA_GLINTEROP_INVALID_{VALUE,VERSION} Be more explicit what it actually does. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	c196de23ae	mesa_glinterop: s/struct_version/version/ OCD polish for consistency with other mesa interfaces. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	cb0708c843	mesa_glinterop: fix GL interop *_VERSION comments Using the macro to set the version is wrong and ill-advised. Please don't do it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	a3eb8702fb	mesa_glinterop: remove inclusion of EGL header Analogous to previous commit, but for EGL. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	8472045b16	mesa_glinterop: remove inclusion of GLX header Since we only need partial information about the GLX symbols we can forward declare them and drop the include. Obviously each user of the said API will need more than what's provided, so they'll include the GLX header. If they don't, the compiler will give us a nice warning ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	b5f9820d90	mesa_glinterop: remove unneeded GLAPI/GLAPIENTRY/APIENTRYP symbols These come from windows.h, gl.h, glcorearb.h and/or glext.h. The interop interface is aimed at non-Windows platforms while the macros are used/derived due to Windows specifics. Thus we can safely remove them. Strictly speaking there should be GLXAPIENTRY/EGLAPIENTRY and alike macros, although a) there is no GLX ones and b) this brings us even further from decoupling the file from the GLX/EGL header dependency. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	bcf9e47653	mesa_glinterop: replace GL types with their native counterpart. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:56 +01:00
Emil Velikov	2e726144f9	mesa_glinterop: use generic variable types for the GL interop Thus we can preserve the ABI, while avoiding the inclusion of some/all of the following: EGL/egl.h GL/gl.h GL/glcorearb.h GLES/gl.h GLES2/gl2.h GLES3/gl3.h GLES3/gl31.h This will allow us to build/use it alongside any combination of APIs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:08 +01:00
Emil Velikov	cbf29d90ba	mesa_glinterop: use consistent naming scheme for GL interop Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:08 +01:00
Emil Velikov	0d31bfd71a	Revert "mesa: Build EGL without X11 headers after interop patchset" This reverts commit `4e2c9a0435`. The solution was incomplete and fragile. An alternative one is coming shortly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:05 +01:00
Ian Romanick	c8d9ed5ea1	docs: Note that GL_OES_geometry_shader and GL_OES_tessellation_shader are started The GL_OES_geometry_shader work is on the oes_shader_io_blocks branch of idr's fd.o repository. The GL_OES_tessellation_shader work is on the tess-gles branch of kwg's fd.o repository. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-24 12:45:46 -07:00
Emil Velikov	7e196cd170	c11/threads: resolve link issues with -O0 Add weak symbol notation for the pthread_mutexattr* symbols, thus making the linker happy. When building with -O1 or greater the optimiser will kick in and remove the said functions as they are dead/unreachable code. Ideally we'll enable the optimisations locally, yet that does not seem to work atm. v2: Add the AX_GCC_FUNC_ATTRIBUTE([weak]) hunk in configure. Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-05-24 20:21:31 +01:00
Tim Rowley	0ceed1701d	swr: [rasterizer] remove containers.hpp Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:37 -05:00
Tim Rowley	1e3e22efb5	swr: [rasterizer core] remove utility dead code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:29 -05:00
Tim Rowley	dc34479b8c	swr: [rasterizer core] buckets fixes 1. Don't clear bucket descriptions to fix issues with sim level buckets getting out of sync. 2. Close out threadviz file descriptors in ClearThreads(). 3. Skip buckets for jitter based buckets when multithreaded. We need thread local storage through llvm jit functions to be fixed before we can enable this. 4. Fix buckets StopCapture to correctly detect capture complete. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:21 -05:00
Tim Rowley	3074a2b4fa	swr: [rasterizer core] move centroid setup out of CalcCentroidBarycentrics Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:14 -05:00
Tim Rowley	9a2a4ecb39	swr: [rasterizer jitter] implement InstanceID/VertexID in fetch jit Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:28:47 -05:00
Ian Romanick	7fc4a82007	mesa: Silence unused parameter warnings Neither shProg nor name was used. Remove them both. main/shader_query.cpp:779:53: warning: unused parameter ‘shProg’ [-Wunused-parameter] program_resource_location(struct gl_shader_program shProg, ^ main/shader_query.cpp:780:72: warning: unused parameter ‘name’ [-Wunused-parameter] struct gl_program_resource res, const char *name, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-24 11:04:08 -07:00
Ian Romanick	78399cf170	glsl/linker: Silence unused parameter warning The parameter is required for the interface. glsl/link_uniforms.cpp:689:61: warning: unused parameter ‘record_type’ [-Wunused-parameter] bool row_major, const glsl_type *record_type, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-24 11:04:05 -07:00
Kristian Høgsberg Kristensen	2bb935be2e	dri: Add YVU formats Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	1be1114e6b	i965: Allow creating planar YUV __DRIimages Lift the restriction we had before and allow creation of images with multiple planes. We still require all the planes to be within the same bo. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	654e950cba	i965: Invoke lowering pass for YUV textures Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	44997fc0c1	i965: Support textures with multiple planes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	3352f2d746	i965: Create multiple miptrees for planar YUV images Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	6eede87631	i965: Refactor intel_set_texture_image_bo() to create_mt_for_dri_image() This function now only creates the mt and we then call intel_set_texture_image_mt() in intel_image_target_texture_2d() to set it for the texture image. Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	8ceb7c7d9b	i965: Use intel_set_texture_image_mt() in intelSetTexBuffer2() Create the mt for the drawable bo directly and call our new intel_miptree_create_for_bo() helper instead. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	40e9be4a5c	i965: Add new intel_set_texture_image_mt() helper This factors out the work of setting up a miptree as the backing for a texture image into a new helper. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	a41b57679f	nir: Add a lowering pass for YUV textures This lowers sampling from YUV textures to 1) one or more texture instructions to sample each plane and 2) color space conversion to RGB. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	50c24c3ff3	nir: Handle NULL in nir_copy_deref() Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	29921ee987	nir: Add new 'plane' texture source type This will be used to select the plane to sample from for planar textures. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Brian Paul	39b7b8b906	mesa: log buffer ID numbers in decimal, not hexadecimal All the other error messages use decimal. Let's be consistent. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 10:26:26 -06:00
Brian Paul	ce1cc70e27	mesa: use enum name in bind_buffer_object() error message Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 10:26:26 -06:00
Brian Paul	55c19527a6	mesa: raise error for glEnable(GL_VERTEX_ARRAY), etc. in core profile Otherwise, if the call executes normally we'll hit an assertion later in the VBO code when we draw something. Note that these cases were already handled correctly for the glIsEnabled() function (and the API checks were copied from there). Tested with new piglit gl-3.1-enable-vertex-array test. v2: fix compat/es mix-up, per Ilia. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-24 10:26:26 -06:00
Nicolas Boichat	a9b2b5e241	docs/egl: Android platform can also be built using autotools We added support for Android builds using autotools (configure); update the documentation to reflect that. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-24 16:24:54 +01:00
Juan A. Suarez Romero	e79aa19d88	i965: fix double-precision vertex inputs measurement For double-precision vertex inputs we need to measure them in dvec4 terms, and for single-precision vertex inputs we need to measure them in vec4 terms. For the latter case, we use the type_size_vec4() function. For the former case, we had a wrong implementation based on type_size_vec4(). This commit introduces a proper type_size_dvec4() function that we use to measure vertex inputs. Measuring double-precision vertex inputs as dvec4 is required because ARB_vertex_attrib_64bit states that these use the same number of locations as the single-precision version. That is, two consecutive dvec4s would be located in location "x" and location "x+1", not "x+2". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-24 10:06:29 +02:00
Ilia Mirkin	ccd58015a2	docs: true up nvc0 status - images, etc Images aren't supported on maxwell, but neither is tessellation. Don't overly confuse matters by trying to expose those subtleties in the GL3.txt file/relnotes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Dave Airlie <airlied@redhat.com>	2016-05-23 23:47:11 -04:00
Ilia Mirkin	856587909c	st/mesa: enable ARB_ES3_1_compatibility when ES 3.1 would be exposed Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 23:47:11 -04:00
Ilia Mirkin	5878254545	mesa: remove separate enable for KHR_robust_buffer_access_behavior This extension appears to be a strict subset of the ARB version. Also remove it from GL3.txt since it doesn't seem relevant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 23:47:11 -04:00
Timothy Arceri	72449c477e	glsl: add support for explicit components to frag outputs V2: fix error checking for arrays and components. V1 was only taking into account all the array elements and all the components of one of the varyings during the comparison and treating the other as a single slot/component. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 12:46:48 +10:00
Ilia Mirkin	37266dfb7c	mesa: add view classes for 3d astc formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 22:34:37 -04:00
Ilia Mirkin	979bcb9f42	glsl: add EXT_clip_cull_distance support based on ARB_cull_distance Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 22:22:06 -04:00
Ilia Mirkin	f236f1f506	nvc0: expose robust buffer access We apparently pass all the relevant CTS tests. There are probably some shortcomings, but they can be addressed down the line. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 22:22:05 -04:00
Jason Ekstrand	9f5ccaf4dc	i965: Use ISL for surface format introspection With this, we can delete the surface format table in brw_surface_formats.c because all of the information we need is now in ISL.	2016-05-23 19:12:34 -07:00
Jason Ekstrand	d68acde1cb	anv/formats: Use isl_format_supports* for format introspection	2016-05-23 19:12:34 -07:00
Jason Ekstrand	7374d006b6	isl: Add per-gen format introspection This is just a copy-and-paste from brw_surface_formats.c. For the supports_vertex_fetch function, we do a bit more work so that it properly handles Bay Trail.	2016-05-23 19:12:34 -07:00
Jason Ekstrand	03a82dc5d1	isl: Add the ISL_FORMAT_R32G32_FLOAT_LD format	2016-05-23 19:12:34 -07:00
Jason Ekstrand	35a514e6ff	isl: Add support for querying the string name of a format	2016-05-23 19:12:34 -07:00
Jason Ekstrand	75d10dff0b	i965: Enable ARB/KHR_robust_buffer_access_behavior on BYT and HSW+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	1a092fcf3b	main: Add extension enable bits for KHR_robust_buffer_access_behavior Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	66e137ecf1	nir/lower_samplers: Protect against sampler index overflow Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	27b9481d03	glsl: Add an option to clamp block indices when lowering UBO/SSBOs This prevents array overflow when the block is actually an array of UBOs or SSBOs. On some hardware such as i965, such overflows can cause GPU hangs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ac242aac3d	glsl/linker: Add a helper variable for compiler options Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	aec10a1d5b	i965/draw: Use the real size for index buffers Previously, we were using the size of the whole BO which may be substantially larger than the actual index buffer size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	7c8dfa78b9	i965/draw: Use the real size for vertex buffers Previously, we were using the size of the BO which may be substantially larger than the actual vertex buffer size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a643bc6246	i965/draw: Use 3-channel formats for vertex fetch when possible. For a long time, several of the 3-channel vertex formats didn't exist so we faked them with 4-channel versions. Starting with Sandy Bridge, we can use R16G16B16_FLOAT and 8 and 16-bit integer formats become available on Haswell and Bay Trail. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ab3d8d5ea4	i965/surface_formats: Update the VB column for new formats added on BYT Bay Trail and Haswell added a bunch of new vertex formats. There was also the addition of 64-bit passthrough formats for BDW+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	d5b4ab2c5f	i965/draw: Properly handle rounding when dividing by InstanceDivisor The old code always divided rounded down and then subtracted 1. What we wanted was to divide rounded up and then subtract 1 which is equivalent to subtracting 1 and then dividing rounded down. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ad42ab473c	i965/draw: Account for BaseInstance in VBO bounds Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ad3deec8ca	i965/draw: Use worst-case VBO bounds if brw->num_instances == 0 Previously, we only handled the "I don't know what's going on" case for things with InstanceDivisor == 0. However, in the DrawIndirect case we can get num_instances == 0 and we don't know what's going on with the instanced ones either. This commit makes the worst-case bound the default and then conservatively tightens the bound. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	8892519751	i965/draw: Delay when we get the bo for vertex buffers The previous code got the BO the first time we encountered it. However, this can potentially lead to problems if the BO is used for multiple arrays with the same buffer object because the range we declare as busy may not be quite right. By delaying the call to intel_bufferobj_buffer, we can ensure that we have the full range for the given buffer. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a01a1eb9e4	i965/draw: Stop relying on min_index == -1 for invalid index bounds The vbo layer passes an index_bounds_valid flag that we should be using instead. This also fixes a bug when min_index == -1 and basevertex != 0 where we were actually comparing min_index + basevertex == -1 which was false and we were getting the wrong buffer-sizing path. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a7011922f1	vbo: Declare the index range invalid for DrawTransformFeedback Right now, we're setting the range to [0, 0] which is obviously bogus. Instead, we should set it to be invalid like we do for DrawIndirect. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	df6ec2aba5	vbo: Declare the index range invalid for DrawIndirect Right now, we're just setting the range to [0, MAX_UINT32] which, while correct isn't helpful. With DrawIndirect, you can't really know what the actual range is so we may as well flag it as being an invalid range. This is what we do for draws with index buffer which is similar (the indices aren't statically known) if a bit simpler. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Ilia Mirkin	21f3df0820	mesa/teximage: fix GL_FLOAT in comment Noticed by Brian. Trivial. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 21:44:41 -04:00
Timothy Arceri	2d9308012c	glsl: fix explicit location validation for doubles Previously we would fail to find a match for the second half of a dvec4 as 'i' would get incremented to 1 before we added the var to the array at component 0. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 11:30:51 +10:00
Dave Airlie	33397bf7fd	docs: update ARB_cull_distance status. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:58 +10:00
Dave Airlie	5c10d47bae	st/mesa: reenable culling Now the lowering pass is fixed, reenable ARB_cull_distance. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:54 +10:00
Dave Airlie	a88c5d7e55	i965: reenable ARB_cull_distance. Now the lowering pass is fixed we can reenable culling. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Dave Airlie	a08c4ebbe8	glsl: rewrite clip/cull distance lowering pass The last version of this broke clipping, and I had to spend some time getting this working properly. I had to introduce a third pass to count the clip/cull totals, all due to one messy corner case. We have a piglit test tes-input-gl_ClipDistance.shader_test that doesn't actually output the clip distances, it just passes them like a varying from TCS->TES, the older lowering pass worked but to lower clip/cull we need to know the total number of clip+culls used to define the new variable correctly, and to offset culls properly. This adds an extra pass that works out the sizes for clip/cull, then lowers gl_ClipDistance then gl_CullDistance into the new gl_ClipDistanceMESA. The pass checks using the fixed array sizes code if the array has been referenced, or is actually never used, and ignores it in the latter case. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Dave Airlie	8c628ab13e	glsl: make max array trackers ints and use -1 as base. (v2) This fixes a bug that breaks cull distances. The problem is the max array accessors can't tell the difference between a never-accessed unsized array and an unsized array accessed at location 0. This leads to converting an undeclared unused gl_ClipDistance inside or outside gl_PerVertex to a size 1 array. However we need the number of active clip distances to work out the starting point for the cull distances, and this offset by one when it's not being used isn't possible to distinguish from the case where only the first element is accessed. I tried to use ->used for this, but that doesn't work when gl_ClipDistance is part of an interface block. So this changes things so that max_array_access is an int and initialised to -1. This also allows unsized arrays to proceed further than they could before, but we really shouldn't mind as they will get eliminated if nothing uses them later. For initialised uniforms we no longer change their array size at runtime, if these are unused they will get eliminated eventually. v2: use ralloc_array (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Nanley Chery	2ae493d686	anv/formats: Make alpha blending a property of render targets In agreement with the SNB PRM, alpha blending is a property that render targets may or may not support. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 17:26:17 -07:00
Nanley Chery	9721be6681	i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORM This format does not support alpha blending, according to the SNB PRM. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 17:26:17 -07:00
Dave Airlie	8b89c92ef6	i965: deindent blorp code. gcc6 warns about this. Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 10:14:31 +10:00
Dave Airlie	e257284481	glsl: reindent line in ast_function.cpp This fixes a warning with gcc -Wmisleading-indentation. Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 10:14:31 +10:00
Ilia Mirkin	82d756f3af	mesa: allow GL_FRAMEBUFFER_DEFAULT_LAYERS to be queried with ES geometry When we have the geometry extensions, enable querying of the new param. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 20:03:40 -04:00
Ilia Mirkin	2dabd49704	mesa: allow xfb to be active in GLES when geometry shader is enabled. OES_geometry_shader has wording to allow xfb when using Draw*Indirect and DrawElements. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 20:03:20 -04:00
Ilia Mirkin	2e8e1e8909	main: check driver float texture support before upgrading to 16F/32F When passing in GL_RGBA or other base formats, we will try to upgrade the format to whatever the passed in type was. However not all drivers (notably nv30) support 32F textures, and so this would lead to crashes down the line. Only upgrade when the relevant extensions are available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 20:00:39 -04:00
Ilia Mirkin	1e99a46b44	st/mesa: update inst->info along with inst->op Otherwise we still have TGSI_OPCODE_CMP's info, which causes a number of later logic to go wrong. This fixes dEQP-GLES2.functional.shaders.functions.control_flow.return_in_if_vertex on nv30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-23 19:58:53 -04:00
Bas Nieuwenhuizen	533d1e9085	glsl: Use correct mode for split components. The mode should stay the same as the original struct. In particular, shared should not be changed to temporary. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-24 09:55:38 +10:00
Kenneth Graunke	1c1873b93b	mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED). Technically, this was introduced with GL 4.4. However, I believe it was intended to be retroactive. As far as I know, AMD has never supported primitive restart with patches, while NVidia and Intel do. This necessitated the need for a query which would allow applications to figure out whether this was usable or not. I decided to expose it everywhere ARB_tessellation_shader is exposed. (It's also in both OES and EXT_tessellation_shader.) Enable this for i965 and Gallium drivers which expose the capability. v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin). Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-23 16:44:22 -07:00
Kenneth Graunke	70048eb1e3	gallium: Add a pipe cap for whether primitive restart works for patches. Some hardware supports primitive restart on patch primitives, and other hardware does not. Modern GL and ES include a query for this feature; adding a capability bit will allow us to answer it. As far as I know, AMD hardware does not support this feature, while NVIDIA and Intel hardware does. However, most Gallium drivers do not appear to support tessellation shaders yet. So, I've enabled it for nvc0 and disabled it everywhere else. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-23 16:44:11 -07:00
Francisco Jerez	015035027b	i965/fs: Mark UBO uniform pull constant loads as force_writemask_all. This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:07:23 -07:00
Francisco Jerez	7eb4966887	i965/fs: Allow spilling of non-contiguous registers. This should be working fine now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	6fc5dd5b6a	i965/fs: Calculate the (un)spill block size correctly. Currently the spilling code attempts to guess the scratch message block size from the dispatch width of the shader, which is plain wrong for SIMD-lowered instructions (frequently but not exclusively encountered in SIMD32 shaders) or for instructions with register region data types of size other than 32 bit. Instead try to use the SIMD component size of the instruction which in some cases will allow the dataport to apply the correct channel mask to the scratch data read or written. In the spill case the block size needs to be clamped to the number of MRF registers reserved for spilling. In the unspill case I didn't even bother because we currently have no 100% accurate way to determine whether a source region is per-channel or whether it contains things like headers that don't respect channel boundaries -- That's fine, because the unspill is marked force_writemask_all we can just use the largest allowable scratch message size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	11260cc54f	i965/fs: Set exec_all on spills not matching the channel layout of the instruction. This prevents the application of an incorrect channel mask by the scratch write instruction for spilled variables that don't have an exact one-to-one correspondence between channels of the variable and 32-bit components of the scratch write instruction. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	bb67c467a4	i965/fs: Set exec_all on unspills. This makes sure that unspills restore the exact contents of the variable in scratch space into the GRF without applying channel masking, which is incorrect under control flow for things like message headers or vectors of heterogeneous types that don't properly respect channel boundaries. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	07e67cc266	i965/fs: Move scratch block size calculation into the caller of emit_(un)spill. This makes emit_(un)spill even more stupid by removing the logic that decides what execution size each scratch read or write send message should have and instead relying on the caller to specify an appropriate execution size via the builder argument. This makes sense because the caller will need to act differently based on the scratch message width (e.g. emit an additional unspill before the instruction if the execution width and channel layout of the spill doesn't match the instruction's). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	284c8fbcef	i965/fs: Make emit_spill/unspill static functions taking builder as argument. This seems cleaner than exposing an implementation detail of brw_fs_reg_allocate.cpp to the world, and will give the caller control over the instruction execution flags (e.g. force_writemask_all) that are applied to the scratch read and write instructions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	70023c40c6	i965/fs: Apply execution controls from the instruction to scratch messages. Until now the execution controls (e.g. channel group, force_writemask_all, exec_size) of the instruction had been completely ignored by spilling, even though that can lead to a mismatch between the channel mask applied to the contents of the (un)spilled memory and the GRF source or destination of the instruction. In some cases we'll actually want the (un)spill messages to be marked force_writemask_all regardless of whether the instruction has it set, but that will have to be handled specially by the caller. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	e98cf03114	i965/fs: Fix signedness of local variables and arguments of emit_(un)spill. To avoid some spurious warnings about comparison signedness in the following commits. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	f471d3eede	i965/fs: Factor out calculation of the block of MRFs reserved for spilling. And as we're at it fix the calculation to allocate a larger block of registers for 32-wide dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Plamena Manolova	21edd24c0d	egl: Add OpenGL_ES to API string regardless of GLES version According to the EGL specifications eglQueryString(EGL_CLIENT_APIS) should return a string containing a combination of "OpenGL", "OpenGL_ES" and "OpenVG", any other values would be considered invalid. Due to this when the API string is constructed, the version of GLES should be disregarded and "OpenGL_ES" should be attached once instead of "OpenGL_ES2" and "OpenGL_ES3". Fixes: dEQP-EGL.functional.negative_api* and dEQP-EGL.functional.query_context.simple.query_api Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-23 13:46:01 -07:00
Rob Clark	46ff17559b	freedreno/ir3: disable cp for indirect src's The variable-indexing tests always had a few random fails, which I usually couldn't reproduce when running tests manually. Somehow recently this got a lot worse. I ported a couple of the shaders to GLES to see what blob does, and it also seems to avoid cp of indirect srcs. So I guess indirect w/ instructions other than cat1 (mov) are not totally reliable. Let's just switch that off until this is better understood. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-23 15:57:13 -04:00
Samuel Pitoiset	c3c4370299	nvc0: do not invalidate compute constbufs on Kepler Constbufs are only aliased on Fermi and this will reduce the number of flushes when we switch between 3d and compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 20:56:29 +02:00
Rob Clark	5245d845b6	nir/validate: fix null deref coverity warning CID 1265536 (#1 of 2): Explicit null dereferenced (FORWARD_NULL)6. var_deref_op: Dereferencing null pointer parent. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 10:14:50 -04:00
Nicolas Boichat	0cbc90c57c	mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY \| RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW \| RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: s/explicitely/explicitly/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 13:25:51 +01:00
Nicolas Boichat	27d713a004	configure.ac: Add support for Android builds Add support for EGL android platform. Also, detect when --host finishes with -android. In that case, we do not set _GNU_SOURCE, and define autoconf symbol HAVE_ANDROID, so that Android-specific workarounds can be applied. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: Rebase on top of HAVE_EGL_PLATFORM_NULL removal] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 13:23:39 +01:00
Emil Velikov	960d854a98	anv: remove define _DEFAULT_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Emil Velikov	1b64d1247d	gbm: remove define _DEFAULT_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Emil Velikov	efe4beb717	gbm: remove define _BSD_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Jiri Slaby	a6ce91fe52	glxcmds: glXGetFBConfigs, fix screen bounds Bounds of screen are 0 (inclusive) and ScreenCount(dpy) (exclusive). The upper bound was wrongly checked as ScreenCount(dpy) (inclusive). This causes a crash invoked by java3d which passes down an invalid screen: 6 0x00007f0e5198ba70 in <signal handler called> () at /lib64/libc.so.6 7 0x00007f0e14531e14 in glXGetFBConfigs (dpy=<optimized out>, screen=1, nelements=nelements@entry=0x7f0dab3c522c) at glxcmds.c:1660 8 0x00007f0e14532f7f in glXChooseFBConfig (dpy=<optimized out>, screen=<optimized out>, attribList=0x7f0dab3c54e0, nitems=0x7f0dab3c535c) at glxcmds.c:1611 9 0x00007f0e1478d29b in find_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 10 0x00007f0e1478d3dc in find_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 11 0x00007f0e1478d567 in find_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 12 0x00007f0e1478d728 in find_DB_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 13 0x00007f0e1478d97c in Java_javax_media_j3d_X11NativeConfigTemplate3D_chooseOglVisual () at /usr/lib64/libj3dcore-ogl.so While ScreenCount(dpy) is actually 1: (gdb) p dpy->nscreens $2 = 1 screen=1 is passed to glXGetFBConfigs. Fix this typo in glXGetFBConfigs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95456 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:47 +01:00
Elie TOURNIER	0f738fa23e	doxygen: Add missing modules to Windows runner Acked-by: Rhys Kidd <rhyskidd@gmail.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	793574afad	egl: add missing link against $(CLOCK_LIB) Some platforms require separate library in order to resolve the clock_gettime() symbol. Add the link or the build will fail. Fixes: `70299474f5` ("egl: add EGL_KHR_reusable_sync to egl_dri") Cc: Dongwon Kim <dongwon.kim@intel.com> Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	d67e757d11	egl: android: remove explicit glFlush call The DRI flush extension should already do the same thing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	9b3c7481c6	egl: android: drop dri2_create_image_android_native_buffer argument The drv is no longer used/needed as of last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	38ef6f5f60	egl: android: directly use dri2_create_image_dma_buf() Make the function non static so that we can use it directly from the android platform code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	2cd687ce97	configure.ac: error out when building from git without python3 Bail early, as opposed to later on during the build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	a155cdaace	vl/drm: don't call close(-1) in vl_drm_screen_create error path Analogous to previous commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	ed3f6ccce0	st/xa: don't call close(-1) in xa_tracker_create error path Analogous to previous commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:46 +01:00
Emil Velikov	6e00a1e6cb	st/dri: don't call close(-1) in dri{2, kms_}_init_screen error path Add separate labels and jump to the correct one as needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:46 +01:00
Eric Engestrom	7362bb3e21	vk/intel: use negative VK_NO_PROTOTYPES scheme `3d0fac7aca` changed all VK_PROTOTYPES to VK_NO_PROTOTYPES This brings the Intel header in line with the rest of the Vulkan code. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-23 12:07:46 +01:00
Rob Herring	8aeb6d768b	gbm: Add map/unmap functions This adds map and unmap functions to GBM utilizing the DRIimage extension mapImage/unmapImage functions or existing internal mapping for dumb buffers. Unlike prior attempts, this version provides a region to map and usage flags for the mapping. The operation follows the same semantics as the gallium transfer_map() function. This was tested with GBM based gralloc on Android. Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: drop no longer relevant hunk from commit message.] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	1f4869a208	configure.ac: add pthreadstubs support Add pthreadstubs to avoid pulling in full pthreads library. GBM will be the first user. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	0a4275b534	gbm: rename gbm_dri_bo_{map,unmap} to gbm_dri_bo_{map,unmap}_dumb In preparation to add public map/unmap functions, rename the existing gbm_dri_bo_{map,unmap} functions to indicate that they are only for dumb buffers. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:46 +01:00
Rob Herring	e8431a630d	st/dri: Add support for DRIimage extension mapImage/unmapImage Implement support for mapImage/unmapImage functions in version 12 of the DRIimage extension. Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: align/indent the map/unmap vfuncs] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	a0f06f168f	DRI: Add DRIimage map and unmap functions Add mapImage and unmapImage functions to DRIimage extension for mapping and unmapping DRIimages for CPU access. The caller provides the region of the image to map and is returned a pointer to the beginning of the region and the stride (which could be different from the original). Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	bdfa635f72	gbm: Add Android build support In order to use libgbm for gralloc, add it to the Android build. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	64a005e3ee	gbm: add Android gallium_dri.so library loading support GBM needs the same special gallium_dri.so loading as EGL for Android, so copy over the same hunk from the EGL code. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	7d79eec456	gbm: split out source file to Makefile.sources In preparation to add Android build support, split out the source file lists to Makefile.sources Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net> [Emil Velikov: Whitespace cleanup.] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:46 +01:00
Rob Herring	fc1806e041	Android: Move setting DEFAULT_DRIVER_DIR to shared location Move the defining of DEFAULT_DRIVER_DIR path to a common location so both EGL and GBM can use it. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:45 +01:00
Emil Velikov	6ce11e7e2c	c11/threads: create mutexattrs only when needed If the mutexattrs are the default one can just pass NULL to pthread_mutex_init. As the compiler does not know this detail it unnecessarily creates/destroys the attrs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:45 +01:00
Andres Gomez	4424bf5da4	configure: added xcb to dri3 modules to pkg-conf This fixes a recent linking error in libvulkan_common Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-05-23 11:21:34 +02:00
Juan A. Suarez Romero	3c9096eea4	glsl/linker: dvec3/dvec4 consume twice input vertex attributes From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): "A program with more than the value of MAX_VERTEX_ATTRIBS active attribute variables may fail to link, unless device-dependent optimizations are able to make the program fit within available hardware resources. For the purposes of this test, attribute variables of the type dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may count as consuming twice as many attributes as equivalent single-precision types. While these types use the same number of generic attributes as their single-precision equivalents, implementations are permitted to consume two single-precision vectors of internal storage for each three- or four-component double-precision vector." This commit makes dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 consume twice as many attributes as equivalent single-precision types. v3: count doubles as consuming two attributes (Dave Airlie) v4: make reference to spec (Michael Schellenberger Costa) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2016-05-23 10:48:07 +02:00
Francisco Jerez	b46867cd37	i965/fs: do not depend on std140 alignment rules for UBO loads The previous implementation relied on the std140 alignment rules to avoid handling misalignment in the case where we are loading more than 2 double components from a vector, which requires emitting a second load message. This alternative implementation deals with misalignment and is more flexible going forward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-05-23 08:56:57 +02:00
Iago Toral Quiroga	38b719d624	nir: handle double-precision in fsign, fsat, fnot and frcp I think these are not strictly necessary since the floats in them should be automatically promoted to doubles when operated with double sources, but it makes things more explicit at least. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-23 08:54:37 +02:00
Iago Toral Quiroga	3f73039ade	nir: handle double-precision in fabs, frsq and fsqrt Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-23 08:54:28 +02:00
Dave Airlie	3466db3969	glsl/parser: handle multiple layout sections with AST nodes. For geometry/compute inputs and tess control outputs, we create an AST node to keep track of some things. However if we have multiple layout sections, we don't ever link the node into the AST. This is because we create the node on the rightmost layout declaration and don't pass it back in so it gets linked at the end of the parsing of the rightmost. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:20:01 +10:00
Dave Airlie	aaa69c79cd	glsl: allow layout qualifier overrides with ARB_shading_language_420pack GLSL 4.20 allows overriding the layout qualifiers. This helps fix: GL45-CTS.shading_language_420pack.qualifier_override_layout Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	6f2dc0d044	subroutines: handle explicit indexes properly The code didn't deal with explicit function indexes properly. It also handed out the indexes at link time, when we really need them in the lowering pass to create the correct if ladder. So this patch moves assigning the non-explicit indexes earlier, fixes the lowering pass and the lookups to get the correct values. This fixes a few of: GL45-CTS.explicit_uniform_location.subroutine-index-* Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	5fe912831c	mesa/subroutines: fix reset on bindpipeline Fixes: GL45-CTS.shader_subroutine.subroutine_uniform_reset Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	7fa0250f94	mesa/subroutines: count number subroutines properly. The code was implementing the ACTIVE_SUBROUTINE_UNIFORMS incorrectly, using the number of types not the number of uniforms. This is different than the locations as the locations may be sparsely allocated. This fixes: GL43-CTS.shader_subroutine.four_subroutines_with_two_uniforms Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	22db9b10eb	mesa/subroutines: don't generate error in GetSubroutineIndex. GLSL spec says this doesn't generate an error. Fixes: GL45-CTS.explicit_uniform_location.subroutine-loc Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	3b8b6be7bb	glsl/ast: for geom shaders allow stream flags in input flags. This fixes: GL45-CTS.shader_subroutine.subroutines_with_separate_shader_objects Since we set the stream flags earlier on all geom shaders, we shouldn't fall over later if we find one. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	93b3b6af3c	glsl/linker: skip inactive explicit locations. This fixes a crash in: GL45-CTS.explicit_uniform_location.subroutine-loc-negative-link-max-num-of-locations Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	c714731653	glsl: fix subroutine uniform .length(). This fixes .length() on subroutine uniform arrays, if we don't find the identifier normally, we look up the corresponding subroutine identifier instead. Fixes: GL45-CTS.shader_subroutine.arrays_of_arrays_of_uniforms GL45-CTS.shader_subroutine.arrayed_subroutine_uniforms Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	432ac19c1a	glsl/linker: link error on too many subroutine functions. This fixes: GL45-CTS.explicit_uniform_location.subroutine-index-negative-link-max-num-of-indices Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	18b0a13e80	glsl: produce a linker error for a subroutine uniform with no functions. If a subroutine uniform is declared with no functions backing it, that isn't legal, so we should fail to link. Fixes: GL43-CTS.shader_subroutine.subroutine_uniform_wo_matching_subroutines Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	b572b599ef	glsl: validate subroutine types match function signature. This fixes: GL43-CTS.shader_subroutine.subroutines_incompatible_with_subroutine_type It just makes sure the signatures match as well as the return types. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	ba3414d832	arb_shader_subroutine: check active subroutine limit _mesa_GetActiveSubroutineUniformiv needs to check against the number of types here. Noticed while playing with ogl conform. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:18:25 +10:00
Ilia Mirkin	74e71cbfcb	nv30: don't assert when running out of registers This happens with dEQP tests. The code doesn't at all protect against this condition, so while unhandled, this is an expected situation. Also avoid using more than the first 16 registers for nv3x vertex programs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 22:57:18 -04:00
Ilia Mirkin	36ff09cdfe	nouveau: allow allocating non-object-backed buffers On nv30, for example, there is no hardware index buffer support. So all of those will be created entirely in user memory. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 22:57:18 -04:00
Tobias Klausmann	96f390ff35	llvm/softpipe: Enable cull_distance as draw supports it. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:04:37 +10:00
Dave Airlie	e6d9389366	tgsi: remove culldist semantic. This isn't used anymore in the tree, culldist's are part of the clipdist semantic, we could in theory rename it, but I'm not sure there is much point, and I'd have to be careful with virgl. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:44 +10:00
Dave Airlie	d17062a40e	draw: stop using CULLDIST semantic. The way the HW works doesn't really fit with having two semantics for this. The GLSL compiler emits 2 vec4s and two properties, this makes draw use those instead of CULLDIST semantics. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:40 +10:00
Emil Velikov	bddb3b5375	virgl: remove unused state_tracker/graw.h include Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:02:17 +10:00
Dave Airlie	62c728f7d8	mesa/queryobject: return INVALID_VALUE if offset < 0 (v2) This fixes: GL45-CTS.direct_state_access.queries_errors The ARB_direct_state_access spec agrees. v2: move check down further (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 07:33:03 +10:00
Samuel Pitoiset	a7fad12931	nvc0/ir: fix indirect access for images When the array doesn't start at 0 we need to account for su->tex.r. While we are at it, make sure to avoid out of bounds access by masking the index. This fixes GL45-CTS.shading_language_420pack.binding_image_array. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 23:06:16 +02:00
Ilia Mirkin	cb9a51d1f6	nv30: reset the stencil mask when fast-clearing Apparently the stencil mask applies to clears on nv30/nv40. Reset it to 0xff before doing a stencil clear. This fixes gl-1.0-readpixsanity and a number of other piglit tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 14:48:56 -04:00
Ilia Mirkin	f57a8440d5	nv30,nv50: add PIPE_SHADER_CAP_PREFERRED_IR support The mesa state tracker has recently started to query this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 14:05:36 -04:00
Ilia Mirkin	9f19ccff9c	nvc0: fix setting of tess_mode in various situations This fixes a lot of INVALID_VALUE errors reported by the card when running dEQP tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Ilia Mirkin	d6edae7090	nv50/ir: fix prog info init Left over from the pre-mainline tess support. Adapt to use the new defines. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Ilia Mirkin	035b1097db	nvc0/ir: return 0 for gl_TessCoord.z for non-triangles modes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Matt Turner	bdc9c20df0	mesa: Unlock mutex on error path. Caught by Coverity (CID 1362021). Caused by commit `015f2207c`.	2016-05-22 07:01:35 -07:00
Timothy Arceri	a83e9afbe4	i965: remove redundant NULL check We would have segfaulted in the above code if prog could be NULL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-22 23:08:08 +10:00
Eduardo Lima Mitev	7dce4793b7	anv/nir_apply_pipeline_layout: Pass the nir_src from the nir_tex_src nir_instr_rewrite_src() expects a nir_src and it is currently being fed a nir_tex_src. This will crash something. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-21 19:57:31 +02:00
Samuel Pitoiset	30b93141aa	nvc0: expose GLSL version 420 on GF100 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:33:06 +02:00
Samuel Pitoiset	d04050071d	nvc0: enable ARB_shader_image_load_store on GF100 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:33:03 +02:00
Samuel Pitoiset	362e17a712	nvc0/ir: add a lowering pass for surfaces on Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:58 +02:00
Samuel Pitoiset	b663db44ba	nvc0/ir: add emission for SULDB and SUSTx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:56 +02:00
Samuel Pitoiset	cd88d1a171	nvc0/ir: add emission for OP_SULEA Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:54 +02:00
Samuel Pitoiset	8aa1fd321d	nv50/ir: fix tex constraints for surface coords on Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:49 +02:00
Ilia Mirkin	be4caaf247	nv50/ir: use moveSources to condense sources This makes sure that rIndirectSrc and other things stay updated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-21 18:32:46 +02:00
Samuel Pitoiset	879bd2ea0c	nvc0: bind images on fragment and compute shaders for Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:41 +02:00
Samuel Pitoiset	e7d2ef42a5	nvc0/ir: don't check the format for surface stores on Kepler Initially to make sure the format doesn't mismatch and won't produce out-of-bounds access, we checked that both formats have exactly the same number of bytes, but this should not be checked for type stores. This fixes serious rendering issues in the UE4 demos (tested with realistic and reflections). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:50:28 +02:00
Samuel Pitoiset	5e32cc9192	nv50/ir: fix a comment in canDualIssue() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:50:25 +02:00
Samuel Pitoiset	70834d05cd	nv50/ir: fix SUSTx constraints on Kepler To prevent out-of-bounds access and format mismatch we add a predicate on sustp, but we have to account for it when the sources are condensed because a predicate is a source. Using the range 3:6 will only condense the input data and it's always the case. This also fixes constraints when an indirect access is used. This ensures that sources are correctly aligned. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:06:14 +02:00
Kenneth Graunke	9c0d16adc1	i965: Just read the existing tally on EndTransformFeedback if paused. If the transform feedback object is paused when ending, then there are no new snapshots to add to the tally. In fact, we haven't written a starting snapshot, so we'd best not try and compute (end - start). Just load the existing tally so we can convert it to the number of vertices written and store it to the final result location. This is the Haswell+ equivalent of the previous commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:42 -07:00
Kenneth Graunke	915f7c25fa	i965: Don't write a counter snapshot on EndTransformFeedback if paused. If the transform feedback object is paused, then we've already written an ending counter snapshot. We don't want to write another one. This fixes assertions in GL33-CTS.transform_feedback.api_errors_test, which calls EndTransformFeedback after PauseTransformFeedback. On the next BeginTransformFeedback, we tried to tally up the results, and saw an odd number of snapshots (due to the double-end), and tripped an assertion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:40 -07:00
Kenneth Graunke	47fbe178fa	mesa: Call TransformFeedback driver hooks before setting flags. This way, the driver's EndTransformFeedback() hook can tell whether the transform feedback operation was paused. It's also convenient to have Paused remain false until the driver's PauseTransformFeedback hook finishes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:26 -07:00
Kenneth Graunke	f7eb95a526	nir: Fix crash in nir_lower_wpos_center(). Otherwise we rewrote the fadd to use itself, causing crashes in validation. Instead, start after the last use like we should. A brown paper bag fix. Fixes crashes in several Vulkan tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 16:33:24 -07:00
Dave Airlie	0970c563d6	nir: remove dead glsl variables before lowering io. For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 08:56:45 +10:00
Kenneth Graunke	de45da6a8c	spirv: Handle the PixelCenterInteger execution mode. This isn't allowed by Vulkan, but might be useful someday for SPIR-V in OpenGL (if that ever becomes a thing). It's easy enough to hook up, and as precedent, we already do so for OriginLowerLeft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 14:44:22 -07:00
Kenneth Graunke	9b8b3f7501	i965: Delete dead dFdy flipping code. Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case of what I expected, so we always take the negate_value case. It doesn't really matter. v2: Write src0 before src1 in ADD instructions (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	08bc74e694	i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height. Now that we handle flipping and other gl_FragCoord transformations via a uniform, these key fields have no users. This patch actually eliminates the associated recompiles. The Tomb Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable number. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	dac10e8a13	i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:08 -07:00
Kenneth Graunke	6e5d86c07a	nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers. nir_lower_wpos_ytransform() is great for OpenGL, which allows applications to choose whether their coordinate system's origin is upper left/lower left, and whether the pixel center should be on integer/half-integer boundaries. Vulkan, however, has much simpler requirements: the pixel center is always half-integer, and the origin is always upper left. No coordinate transform is needed - we just need to add <0.5, 0.5>. This means that we can avoid using (and setting up) a uniform. I thought about adding more options to nir_lower_wpos_ytransform(), but making a new pass that never even touched uniforms seemed simpler. v2: Use normal iterator rather than _safe variant (noticed by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:30:00 -07:00
Kenneth Graunke	12ab7fc6ac	nir: Don't use ffma in nir_lower_wpos_ytransform(). ffma is an explicitly fused multiply add with higher precision. The optimizer will take care of promoting mul/add to fma when it's beneficial to do so. This fixes failures on Gen4-5 when using this pass, as those platforms don't actually implement fma(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	b8b1b1c34c	nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform. These also need flipping! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	4b7577fad8	nir: Make lower_wpos_ytransform_block a void function. The return value was used for the old nir_foreach_block callback system, but at this point it no longer means anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	88ea960aa7	nir: Make nir_lower_wpos_ytransform() match FragCoord by location. gl_FragCoord is a shader input with location == VARYING_SLOT_POS. ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS, but it isn't called gl_FragCoord. We do want to transform it. Matching by location guarantees we catch both. Fixes several fp tests on a branch which uses this pass on i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	c9192fcbd2	nir: Add interp_var_at_offset flipping. The Y-offset needs flipping as well, similar to ddy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	287f099db1	nir: Fix fddy swizzles in nir_lower_wpos_ytransform(). The original value might have been swizzled. That's taken care of in the fmul source - we don't want to reswizzle it again. Fixes validation failures in glsl-derivs-varyings on a branch of mine which uses this pass in i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	7fe9a19302	nir: Fix wpos_ytransform lowering state_slot swizzle. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:28:30 -07:00
Kenneth Graunke	1539009bf0	i965: Fix brw_regs_equal() for NaN and positive/negative zero. We'd like the comparisons to mean "the exact same bits". Comparing doubles won't do that for NaN values or positive vs. negative zero. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:28:06 -07:00
Dave Airlie	b19a0d506d	virgl: handle cull distance cap. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:19:54 +10:00
Rob Herring	2235b80f2a	virgl: Add missing texture transfer_inline_write transfer_inline_write cannot be NULL and the virgl renderer doesn't support inline writes for textures, so add the default version. This fixes a crash in st_TexSubImage since commit `fb9fe352ea` ("st/mesa: use transfer_inline_write for memcpy TexSubImage path"). Cc: Marek Olšák <marek.olsak@amd.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:07:18 +10:00
Kristian Høgsberg Kristensen	12dc89d844	anv: Merge in my TODO list items Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-20 10:35:57 -07:00
Matt Turner	015f2207cf	mesa: Replace uses of Shared->Mutex with hash-table mutexes We were locking the Shared->Mutex and then calling functions like _mesa_HashInsert that do additional per-hash-table locking internally. Instead just lock each hash-table's mutex and use functions like _mesa_HashInsertLocked and the new _mesa_HashRemoveLocked. In order to do this, we need to remove the locking from _mesa_HashFindFreeKeyBlock since it will always be called with the per-hash-table lock taken. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	aded1160e5	hash: Add _mesa_HashRemoveLocked() function. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	fb5dcb81cc	i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so before 5766074 264648 29320 6060042 5c780a lib/i965_dri.so after Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 10:04:06 -07:00
Mark Janes	9ca5ec2a31	glsl: Guard against NULL dereference This trivially corrects mesa `3ca1c221`, which introduced a check that crashes when a match is not found. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-20 09:52:49 -07:00
Nanley Chery	9b8c4000d0	anv: Enable textureCompressionASTC_LDR on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	0d2847e177	anv/format: Reorder ASTC mappings to match ISL enum ordering Keep the lists consistent for ease of use. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	f3ed3a0a15	genxml: Expand SKL's SurfaceFormat field width for ASTC In the expanded field, only ASTC format enums have the MSB set to 1. Expanding the field width makes the process of handling these formats identical to the way other formats are handled. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	a141576887	isl: Handle npot ASTC block dimensions on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	de86fb875d	isl: Add 2D ASTC format layouts and enums Also, make changes needed for successful compilation and registration as a texture compression mode. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Youry Metlitsky	4e2c9a0435	mesa: Build EGL without X11 headers after interop patchset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-20 08:44:18 -07:00
Rob Clark	df361fc58c	nir/validate: assume() that hashtable entry exists At this point, it would require a logic error in nir_validate to not have already populated this hashtable entry, but coverity doesn't realize that: CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	fcd6b3f42b	nir: coverity uninitialized pointer read Not sure how coverity arrives at the conclusion that we can read comp[j] uninitialized (around line 204), other than not being aware that ncomp is greater than 1 so it won't underflow in the 'if (tex->is_array)' case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	53c48feae0	nir: coverity sign-extension fix Not 100% sure, but I think being an unsigned literal will help: CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension: load1->def.num_components with type unsigned char (8 bits, unsigned) is promoted in load1->def.num_components * (load1->def.bit_size / 8) to type int (32 bits, signed), then sign-extended to type unsigned long (64 bits, unsigned). If load1->def.num_components * (load1->def.bit_size / 8) is greater than 0x7FFFFFFF, the upper bits of the result will all be 1. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	bb993da795	nir/glsl_to_nir: quell some uninit_member coverity errors Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	3a1bbd6a0a	freedreno/ir3: need to lower fmod too Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-20 11:13:50 -04:00
Mark Janes	a2d28ddc01	i965: Fix strerror error code sign This trivial fix to error-handling corrects the sign of drm error codes before passing them to strerror. Identified by Coverity: CID1358581	2016-05-20 05:58:18 -07:00
Jason Ekstrand	eb384daae8	nir/spirv: Handle the NonReadable decoration on struct members	2016-05-19 21:18:59 -07:00
Jason Ekstrand	ea8c11fdc2	anv/pipeline: Bounds-check resource indices when robust_buffer_access is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	902628bce6	anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	23090b51e0	anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment Originally we removed the instruction, changed the source, and then re-inserted it. This works, but nir_instr_rewrite_src is a bit more obviously correct.	2016-05-19 21:18:59 -07:00
Jason Ekstrand	c29ffea6d1	anv/device: Add a boolean for robust buffer access	2016-05-19 21:18:59 -07:00
Jason Ekstrand	d5b4638d6a	anv: Add a TODO file	2016-05-19 20:09:31 -07:00
Dave Airlie	3ca1c2216d	glsl: handle same struct redeclaration (v2) This works around a bug in older version of UE4, where a shader defines the same structure twice. Although we aren't sure this is correct GLSL (it most likely isn't) there are enough UE4 based things out there we should deal with this. This drops the error to a warning if the struct names and contents match. v1.1: do better C++ on record_compare declaration (Rob) v2: restrict this to desktop GL only (Ian) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-20 11:22:52 +10:00
Matt Turner	8a65b5135a	i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz. Ken suggested instead of a big and complicated optimization pass, to just recognize the operations here. It's certainly less code and a lot prettier, but it seems to actually perform worse for currently unknown reasons. total instructions in shared programs: 8923452 -> 8904108 (-0.22%) instructions in affected programs: 814563 -> 795219 (-2.37%) helped: 3336 HURT: 10 total cycles in shared programs: 66970734 -> 66651476 (-0.48%) cycles in affected programs: 10582686 -> 10263428 (-3.02%) helped: 2438 HURT: 691 total spills in shared programs: 1811 -> 1789 (-1.21%) spills in affected programs: 85 -> 63 (-25.88%) helped: 4 total fills in shared programs: 3143 -> 3109 (-1.08%) fills in affected programs: 167 -> 133 (-20.36%) helped: 4 LOST: 2 GAINED: 36 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	75dccf5ac2	i965: Add infrastructure for sample lod-zero operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	07353599e0	i965/fs: Add and use get_nir_src_imm(). The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Ilia Mirkin	8bf5493899	nvc0: account for shader-allocated local memory needs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-19 20:20:23 -04:00
Ilia Mirkin	5c6b8cc7d0	nv50/ir: treat addresses as local Address registers are always loaded right before use. Don't treat them as "global", which will cause them to be put into the function's linkage, and will make the register allocator hold onto that register until the end of the function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-19 20:20:23 -04:00
Tim Rowley	65c2abf6fd	swr: [rasterizer] utility functions for shared libs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:18 -05:00
Tim Rowley	6deb9f7f2c	swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD llvm changed the mask type to vector of ints with 3.8. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:12 -05:00
Tim Rowley	600528168b	swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:06 -05:00
Tim Rowley	6d212cccf0	swr: [rasterizer jitter] add instancing to non-gather fetch path Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:01 -05:00
Tim Rowley	63d7ed835a	swr: [rasterizer core] move MultisampleTrait static from header to cpp Move a MultisampleTrait static from header to cpp as clang seemed to get confused with some specializations in the header vs some in cpp. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:54 -05:00
Tim Rowley	c969ef2d42	swr: [rasterizer core] clang override for _mm_undefined* Not supported in older xcode versions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:49 -05:00
Tim Rowley	da75160039	swr: [rasterizer common] add OSX to unix portability sections Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:44 -05:00
Tim Rowley	4997169779	swr: [rasterizer] rename _aligned_malloc to AlignedMalloc Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:38 -05:00
Tim Rowley	2e4ef23523	swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:30 -05:00
Tim Rowley	aebbd2f7dd	swr: [rasterizer common] guard definition of __cdecl/__stdcall Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:24 -05:00
Tim Rowley	82e335ce67	swr: [rasterizer common] include cstddef for offsetof Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:19 -05:00
Tim Rowley	759d8cf3a3	swr: [rasterizer core] removed tabs that snuck in Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:14 -05:00
Tim Rowley	8e39d410f1	swr: [rasterizer core] code style cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:08 -05:00
Tim Rowley	b914217c25	swr: [rasterizer core] add dummy code for cygwin build Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:02 -05:00
Tim Rowley	a0747c4ce3	swr: [rasterizer core] move variable query outside loop Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:54 -05:00
Tim Rowley	f2a1f894ba	swr: [rasterizer core] utility function for getenv Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:48 -05:00
Tim Rowley	4a58b21ef7	swr: [rasterizer common] portable threadviz buckets Output with slashes instead of backslashes for unix/linux. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:30 -05:00
Tim Rowley	2031baffb5	swr: [rasterizer common] foreground win32 assert dialog Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:24 -05:00
Tim Rowley	33d4c2c798	swr: [rasterizer core] use parens to disambiguate operator precedence Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:06 -05:00
Tim Rowley	9475251145	swr: standardize linkage and check for unresolved symbols Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	6423004d85	swr: fix swr linkage so that static llvm works Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	8987460b9e	swr: PIPE_CAP_CULL_DISTANCE cap request response Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	78572c9b0b	docs: add swr to GL3.txt v2: not on gl3.3 list until gl3.2 is complete Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 13:27:17 -05:00
Leo Liu	2f90d11d86	st/va: use drm render node for wayland display type With xwayland, vainfo uses VA_DISPLAY_WAYLAND by default and fails, and it also fails when the display is specified with `vainfo --display wayland`. In fact wayland support for libva uses the drm path to connect to the device, and should use the drm pipe loader to create the screen. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-19 09:40:33 -04:00
Marek Olšák	f6742859b7	gallium/radeon: small cleanups in r600_texture_transfer_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	54737aabb9	gallium/radeon: don't set PB_USAGE in winsyses There is no point. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	f330b7a14f	gallium/radeon: handle VRAM_GTT placements as having slow CPU reads not sure if we should include GTT WC too Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	5e14d0ac2c	gallium/radeon: ignore PIPE_TRANSFER_MAP_DIRECTLY Only st/xa is using this, which is irrelevant to us. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	51cf04cf0e	radeonsi: add a workaround for a bug in LLVM <= 3.8 This is not directly applicable to stable and needs to be backported. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Eduardo Lima Mitev	7671687713	i965/fs: Silence warnings related to use of uninitialized values brw_fs.cpp: In function ‘const unsigned int* brw_compile_fs(const [...] brw_fs.cpp:6093:64: warning: ‘simd16_grf_start’ may be used uninitialized [...] prog_data->base.dispatch_grf_start_reg = simd16_grf_start; brw_fs.cpp:5996:29: note: ‘simd16_grf_start’ was declared here uint8_t simd8_grf_start, simd16_grf_start; brw_fs.cpp:6094:52: warning: ‘simd16_grf_used’ may be used uninitialized [...] prog_data->reg_blocks_0 = brw_register_blocks(simd16_grf_used); brw_fs.cpp:5997:29: note: ‘simd16_grf_used’ was declared here unsigned simd8_grf_used, simd16_grf_used; (and more) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-19 09:05:18 +02:00
Eric Anholt	a507dcc160	vc4: Size transfer temporary mappings appropriately for full maps of 3D. We don't really support reading/writing of 3D textures since the hardware doesn't do 3D, but we do need to make sure that a pipe_transfer for them has enough space to store the image. This was previously not a problem because the state tracker only mapped a slice at a time until `fb9fe352ea`. Fixes glean glsl1 tests, which all have setup of a 3D texture at the start.	2016-05-18 17:30:07 -07:00
Nanley Chery	7ac08adfb4	anv/device: Fix viewportBoundsRange Align with the spec requirement that the range must be at least [−2 × maxViewportDimensions, 2 × maxViewportDimensions − 1]. Our hardware supports this. Fixes dEQP-VK.api.info.device.properties Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-18 16:01:50 -07:00
Dave Airlie	61b6789252	glsl/linker: attempt to match anonymous structures at link This is my attempt at fixing at least one of the UE4 bugs with GL4.3. If we are doing intrastage matching and hit anonymous structs, then we should do a record comparison instead of using the names. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-19 08:16:50 +10:00
Mark Janes	4dfa89e33c	anv/batch_chain: free pointers for error cases Trivial fix to improperly handled cleanup during VK_ERROR_OUT_OF_HOST_MEMORY. Identified by Coverity: CID 1358908 and 1358909 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-18 15:14:22 -07:00
Wang He	f21b7d1e5c	st/nine: Minor change to support musl libc A few changes to support musl libc as well. In particular fpu_control.h is glibc specific. fenv.h doesn't let us do exactly what we want either, so instead use assembly directly. Signed-off-by: Wang He <xw897002528@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	de39231134	st/nine: Enable D3DPMISCCAPS_PERSTAGECONSTANT Nine already supports the feature. There are no failing WINE tests for per stage constants. Enabling D3DPMISCCAPS_PERSTAGECONSTANT as it fixes https://github.com/iXit/Mesa-3D/issues/205 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	839f417634	st/nine: Turn on thread_submit by default when on different device The last remaining issues with thread_submit have been resolved, thus turn it on when on a different device (the case where it is beneficial). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	9cae3cdc89	st/nine: Fix usage of rasterizer multisample bit. pipe_rasterizer multisample bit should be enabled only when really wanting to do multisampling, thus we should disable when not having msaa render target. This fixes some depth calculation precision issues on radeon. Also disable it when depth and stencil tests are disabled, since in that case multisampling is same as not multisampled. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	f297e7de0f	st/nine: ATOC has effect only with ALPHATESTENABLE ATOC extension does something only when alpha test is enabled. Use a second bit to encode the difference with ATIATOC. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	edc5cdced5	st/nine: Add debug string for ATOC We were missing a debug string for this format. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	4e89dcf0c4	st/nine: Add asserts for output/input packing Nine doesn't support vs output/ps input packing. We haven't found any application requiring that, and implementing it properly is complex. Add asserts for now. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	aeddda0c3a	st/nine: Use correct PIPE_HANDLE_USAGE flag for frontbuffer copy When taking screenshots we do a copy from the frontbuffer to an allocated buffer (which we then copy to a ram buffer). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	ca7c78a88e	st/nine: Fix output shift calculation We were getting it wrong for negative values. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	b8d95d4087	st/nine: Fix CheckDeviceFormat advertising for surfaces Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	6ef231c80f	st/nine: Improve buffer placement Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	7639033973	st/nine: Fix buffer bind flags Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	0f6e31823d	st/nine: Fix buffer locking flags handling Our behaviour was not entirely similar to what the docs and our tests describe. Drop d3dlock_buffer_to_pipe_transfer_usage. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	f45b9894e5	st/nine: Improve logging Add missing DBG calls in dtors. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	f3fa7e3068	st/nine: Use WINE thread for threadpool Use present interface 1.2 function ID3DPresent_CreateThread to create the thread for threadpool. Creating the thread with WINE prevents some rarely occurring crashes. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	72be473ad1	st/nine: Don't present if window is occluded The problem is that if one d3d present call fails, because of our occlusion check in present method, the next presentation call will send the same pixmap to the Xserver again, without waiting for it to be released, which is wrong. Move the present call after occlusion check to return and prevent Xpixmaps errors. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	c673c46ccf	st/nine: Use new function to query for resolution mismatch Any third party app might change the current screen resolution. Poll for resolution mismatch to force a device reset. Required for non ex devices only. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	dae9a91727	st/nine: Implement IPresent version 1.2 Implement presentation interface version 1.2: * ID3DPresent_ResolutionMismatch Poll for resolution mismatch. A third party app might have changed resolution, which requires a device reset. * ID3DPresent_CreateThread Create a thread in WINE to allow nine to use Windows API functions. Required for multi-threaded presentation. In single-threaded presentation mode the calling thread is already known to WINE. * ID3DPresent_WaitForThread Wait for a wine thread to terminate. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	2e149a2bf0	st/nine: Implement BumpEnvMap for ff Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	c4e85202cb	st/nine: Format conversion for volumes in UpdateTexture We were doing the conversion for surfaces, but not yet volumes. Now that volumes can do conversion, use it. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	23e2a235dc	st/nine: Remove one useless function output Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	10e548c0c9	st/nine: Add support for X8L8V8U8 X8L8V8U8 support should be common. Some more recent cards do support this format, but not L6V5U5. Add fallback for this format to have it always supported. L6V5U5 conversion rule apparently differs a bit from the normal spec, and thus the gallium equivalent format leads to slightly wrong colors. Since some recent cards do not support it, do not support it either. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	258ca1823c	st/nine: Add format fallback with conversion to volumes Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	755fbcdf24	st/nine: Add format fallback with conversion to surfaces Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	52cb8e33c3	gallium/util: Implement util_format_translate_3d This is the equivalent of util_format_translate, but for volumes. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	89344a80fc	st/nine: Fix Pointsize in programmable shader Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	ae0fdd8a40	st/nine: Fix ff pointscale computation Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	c4af309973	st/nine: Fix header of GetIndices There is a mistake in the online documentation, the function only has 2 arguments. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	3e9d01ff39	st/nine: Increase minor d3dadapter9drm ABI Version 0.1 allows to assume that the second element of the IDirect3D* structures will be a pointer to the internal nine vtable. This is useful if the gallium nine user wants to wrap some interfaces. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	2d51c817cd	st/nine: Fix leak after ctor failures Previously ctor failures would not unreference the device. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	7fc8391d23	st/nine: Add ColorFill test for compressed textures ColorFill should contain alignment checks for compressed textures. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	d11d913987	st/nine: PositionT and Tessfactor are forbidden as PS input According to wine tests, they are forbidden as PS input, which makes sense. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	44068af92e	st/nine: Fix some shader failures not triggering error Some failures during shader translation would not raise errors before this patch. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	a77d8cd710	st/nine: Forbid POSITION0 for PS3.0 POSITION0 input is forbidden for PS3.0 apparently. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	217d969746	st/nine: Rework UpdateTexture Checks Our code did match the user documentation of the function quite well (except for format check). However the DDI documentation and wine tests show that documentation was not correct. Thus adapt our code to fit the best possible to the -real- spec. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	4c77673de7	st/nine: Use bufs instead of Flags for Clear bufs doesn't contain depthstencil if there is z buffer mismatch. This is the behaviour we want. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	f7c3d27d18	d3dadapter9: Add ddebug, rbug and trace support Add support for ddebug, rbug and trace Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	0ae3c8ece7	radeon: Change AA sample locations for EG+ This sets the AA location to the d3d11 spec. EG/NI 8X MSAA is left as is. Not sure why it was set different to Cayman, so leave it as is. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	11e4987135	radeonsi: Mixed colorbuffer formats are unsupported Besides depth/stencil, the hardware doesn't support mixed formats. The GL state tracker doesn't make use of them. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	fc3533c088	radeonsi: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	a221f40dbb	r600g: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	7e05e4c388	r600: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Christian Schmidbauer	f5d6ed5702	st/nine: Clean up WINAPI definition As Emil pointed out, only gcc, clang and MSVC compatibility is required. Hence the check for GNUC can be skipped, as __i386__ and __x86_64__ are only defined for gcc/clang, not for MSVC. Remove the #undef which has been there for historic reasons, when wine dlls for nine have been built inside mesa. Instead use #ifndef in order to avoid redefining WINAPI from MSVC's headers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Brian Paul	243fd02858	svga: add another debug_printf() in svga_screen_create() Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-18 14:58:35 -06:00
Brian Paul	96909ef128	spirv: add switch case for nir_texop_txf_ms_mcs in vtn_handle_texture() Mark it as unreachable. Silences a compiler warning: spirv/spirv_to_nir.c:1397:4: warning: enumeration value 'nir_texop_txf_ms_mcs' not handled in switch [-Wswitch] switch (instr->op) { ^ Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-05-18 14:57:45 -06:00
Matt Turner	9c290b1e54	Revert "i965/urb: fixes division by zero" This reverts commit `2a8aa1e3de`.	2016-05-18 12:48:50 -07:00
Ardinartsev Nikita	2a8aa1e3de	i965/urb: fixes division by zero Fixes regression introduced by `af5ca43f26` Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419	2016-05-18 11:09:37 -07:00
Matt Turner	caab3cd536	mesa: fclose() filename on error. Pretty useless, as it's in debugging code. Found by Coverity (CID 1257016).	2016-05-18 11:09:37 -07:00
Matt Turner	cbb0e3a7e8	i965/fs: Assert that nir_op_extract_*'s src1 is a constant.	2016-05-18 11:09:37 -07:00
Matt Turner	6a4ff51f7a	glsl: Check that layout is non-null before dereferencing. layout should only be null for structs, but it's checked everywhere else and confuses Coverity (CID 1358495). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 11:09:37 -07:00
Matt Turner	53f64a8404	egl/dri2: Don't check return result of mtx_unlock(). Coverity (CID 1358496) warns that the cleanup code doesn't unlock the mutex (which is arguably kind of stupid, since the only case that can happen is when mtx_unlock() failed!). But, mtx_unlock() isn't going to fail -- the mutex was locked by this thread just a few lines above it.	2016-05-18 11:09:37 -07:00
Matt Turner	b1e6d069da	spirv: Properly size the src[] array. Operations like nir_op_bitfield_insert have four arguments, and Coverity isn't privy to the fact that 4-argument operations aren't possible here, so it thinks this can lead to memory corruption. Just increase the size of the array to quell any fears.	2016-05-18 11:09:37 -07:00
Matt Turner	0a548eb56f	isl: Mark default cases in switch unreachable. To silence -Wmaybe-uninitialized warnings.	2016-05-18 11:09:37 -07:00
Ian Romanick	7619aed41d	glsl/linker: Ensure the first stage of an SSO pipeline has input locs assigned Previously an SSO pipeline containing only a tessellation control shader and a tessellation evaluation shader would not get locations assigned for the TCS inputs. This would lead to assertion failures in some piglit tests, such as arb_program_interface_query-resource-query. That piglit test still fails on some tessellation related subtests. Specifically, these subtests fail: 'GL_PROGRAM_INPUT(tcs) active resources' expected 2 but got 3 'GL_PROGRAM_INPUT(tcs) max length name' expected 12 but got 16 'GL_PROGRAM_INPUT(tcs,tes) active resources' expected 2 but got 3 'GL_PROGRAM_INPUT(tcs,tes) max length name' expected 12 but got 16 'GL_PROGRAM_OUTPUT(tcs) active resources' expected 15 but got 3 'GL_PROGRAM_OUTPUT(tcs) max length name' expected 23 but got 12 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-18 10:53:50 -07:00
Ian Romanick	79bbff9def	glsl/linker: Don't include interface name for built-in blocks Commit `11096ec` introduced a regression in some piglit tests (e.g., arb_program_interface_query-resource-query). I did not notice this regression because other (unrelated) problems caused failed assertions in those same tests on my system... so they crashed before getting to the new failure. v2: Use is_gl_identifier. Suggested by Tim. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-18 10:53:34 -07:00
Ian Romanick	2ef4b5bc93	glsl: Assert that inputs have a location assigned This catches a problem previously undetected until deep in the backend. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	cf9220b11f	glsl/linker: Fix trivial typos in comments Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	d2579728c9	glsl/linker: Fix some formatting to match current coding conventions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	02e4753777	glsl/linker: Silence unused parameter warning The use of the parameter was removed in `d6b92028`. glsl/link_varyings.cpp:1390:39: warning: unused parameter ‘separate_shader’ [-Wunused-parameter] bool separate_shader) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	75c9aa6670	glsl/linker: Silence unused parameter warning The parameter appears to have been unused since the function was added in commit `12ba6cfb`. Remove it. glsl/linker.cpp:2886:60: warning: unused parameter ‘prog’ [-Wunused-parameter] match_explicit_outputs_to_inputs(struct gl_shader_program *prog, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	f687b8e178	i965: Silence unused parameter warnings The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter. brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter] const glsl_type *type) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Daniel Scharrer	1d628ea09d	mesa: Don't advertise GLES 3.1 without compute support The MaxComputeWorkGroupInvocations constant is used in compute_version_es2() instead of extensions->ARB_compute_shader as ES has lower requirements than desktop GL. Both i965 and gallium set this constant before enabling compute support. Signed-off-by: Daniel Scharrer <daniel@constexpr.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 18:21:21 +02:00
Rob Clark	5827a1dc4b	mesa/st: don't leak name Pointed out by coverity. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-18 09:20:22 -04:00
Brian Paul	877a8026c7	svga: null out all sampler views if start=num=0 Because the CSO module handles sampler views for fragment shaders differently than vertex/geom shaders, VS/GS shader sampler views aren't explicitly unbound like for FS sampler views. This code checks for the case of start=num=0 and nulls out the sampler views. Fixes an assert regression in piglit's arb_texture_multisample-sample-position test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-17 19:20:36 -06:00
Brian Paul	fe430b0310	st/mesa: remove unused st_context::default_texture The code which used this was removed quite a while ago. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-17 19:20:36 -06:00
Brian Paul	5888c47cc9	cso: remove / add some comments Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-17 19:20:36 -06:00
Eric Anholt	18260d0582	vc4: Add support for vertex color clamping in the rasterizer. This gets us precompile of vertex shaders at the state tracker level as well.	2016-05-17 18:09:58 -07:00
Eric Anholt	474e2bbcc1	vc4: Move tgsi_to_nir to precompile time. Now we have an immutable nir shader in our shader's CSO that we can clone and lower/optimize.	2016-05-17 18:07:39 -07:00
Eric Anholt	734fe41092	vc4: Mark the driver as supporting fragment color clamping in rast. We always clamp fragment colors, since they're always 8-bit unorm, so there's no need to have us compile separate shaders based on GL_ARB_color_buffer_float. This gives us precompilation of fragment programs to the vc4_shader_state_create() level.	2016-05-17 18:07:39 -07:00
Eric Anholt	8835eb689b	vc4: Enable sharing shaders across contexts. This allows the same pipe_shader_state to be referenced from multiple contexts. Since our pipe_shader_state is treated as immutable (other than the variant number) within the driver, this is no problem.	2016-05-17 18:07:39 -07:00
Eric Anholt	62087cb9b8	vc4: Switch to using nir_load_front_face. This will be generated by glsl_to_nir, and it turns out that this is a more code-efficient path than the floating point math, anyway. No change on shader-db, but drops an instruction in piglit's glsl-fs-frontfacing.	2016-05-17 18:07:39 -07:00
Eric Anholt	0700e4c0c7	vc4: Drop the dead export_linkage array. This came from deriving from freedreno.	2016-05-17 18:07:39 -07:00
Eric Anholt	24e7e3d3fc	vc4: Fix a -Wformat-security warning. This is apparently enabled as an error in Android builds, and the compiler can't tell that the return value is safe.	2016-05-17 18:07:39 -07:00
Alex Deucher	86f51d7958	radeonsi: add new polaris11 pci ids Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-17 17:49:50 -04:00
Alex Deucher	768320b497	radeonsi: add new polaris10 pci ids Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-17 17:49:50 -04:00
Kenneth Graunke	dc657a8201	i965: Make brw_reg_from_fs_reg() halve exec_size when compressed. In `a5d7e144ea`, Connor generalized the exec_size halving code to handle more cases. As part of this, he made it not halve anything if the region accessed falls completely in a single register. Unfortunately, it started producing some invalid regions: -add(16) g6<1>F g10<8,8,1>UW -g1<0,1,0>F { align1 compr }; -add(16) g8<1>F g12<8,8,1>UW -g1.1<0,1,0>F { align1 compr }; +add(16) g6<1>F g10<16,16,1>UW -g1<0,1,0>F { align1 compr }; +add(16) g8<1>F g12<16,16,1>UW -g1.1<0,1,0>F { align1 compr }; Here, the UW source region completely fits within a register. However, we have to use instruction compression because the destination region spans two registers. <16,16,1> is invalid because it's compressed. To handle this, skip the "everything fits in one register" case and fall through to the exec_size halving case when compressed. Fixes hundreds of Piglit regressions on GM965. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-17 14:40:37 -07:00
Kenneth Graunke	062ad81669	i965: Move compression decisions before brw_reg_from_fs_reg(). brw_reg_from_fs_reg() needs to know whether the instruction will be compressed or not. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-17 14:40:31 -07:00
Kenneth Graunke	9a1936d965	i965: Enable ES 3.2 sample shading extensions. This enables: - GL_OES_sample_shading - GL_OES_sample_variables - GL_OES_shader_multisample_interpolation On Gen8, we pass all the CTS tests, and all but 4 of the dEQP-GLES31 tests (dealing with 1x/2x MSAA at half rate sampling). We believe those 4 dEQP-GLES31 tests are incorrect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-17 14:27:29 -07:00
Jordan Justen	1ff212bfd3	anv: Fix warning: unused variable ‘cs_prog_data’ This was introduced in `8a80af2820`. Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-17 14:09:56 -07:00
Mauro Rossi	0e81336550	android: fix building error in libmesa_st_mesa Fixes the following building error due to libmesa_nir dependency: In file included from external/mesa/src/mesa/state_tracker/st_glsl_to_nir.cpp:44:0: external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory #include "nir_opcodes.h" ^ compilation terminated. build/core/binary.mk:706: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o' failed make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o] Error 1 make: *** Waiting for unfinished jobs.... Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-17 17:07:28 -04:00
Nicolai Hähnle	941756f092	radeonsi: force level zero on image instructions in non-fragment shaders (v2) Section 8.9 (Texture Functions) of the OpenGL Shading Language 4.5 specification: However, automatic level of detail is computed only for fragment shaders. Other shaders operate as though the base level of detail were computed as zero. and Section 8.9.3 (Texture Gather Functions): When performing a texture gather operation, the minification and magnification filters are ignored, and the rules for LINEAR filtering in the OpenGL Specification are applied to the base level of the texture image to identify the four texels i_0 j_1, i_1 j_1, i_1 j_0, and i_0 j_0. Of course, explicit LOD or derivative variants work in all shader types. This fixes several GL4x-CTS.texture_gather.* tests. v2: TG4 is always level zero (thanks, Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	988fd6c922	radeonsi: emit TXQ in separate functions TXQ is sufficiently different that having it in the same code path as texture sampling/fetching opcodes doesn't make much sense. v2: guard against NULL pointer dereferences Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	d464bfd12a	winsys/amdgpu: cleanup error handling in amdgpu_ctx_create Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	fef08af99c	winsys/amdgpu: avoid ioctl call when fence_wait is called without timeout When user fences are used, we don't need the kernel for polling. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	0558564200	gallium/radeon: add radeon_emitted to check for non-trivial IBs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	5e89b027b9	gallium/radeon: use radeon_emit_array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	c23273532e	gallium/radeon: use radeon_emit Mostly generated using a sed-script, with manual fix-up for multi-line statements. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:38 -05:00
Nicolai Hähnle	4ac555e9e5	st/mesa: fix reversed copyimage canonical format The format_desc swizzle describes where in the array each color channel comes from - but the existing code was written as if each entry in the swizzle described the meaning of an array element. Fixes piglit's arb_copy_image-format-swizzle. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:38 -05:00
Jordan Justen	6c9f35bb73	Revert "HACK: Don't re-configure L3$ in render stages pre-BDW" This reverts commit `41af9b2e51`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94468 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	8a80af2820	anv: Port L3 cache programming from i965 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	aa41de080d	anv/gen7: Add memory barrier to vkCmdWaitEvents call We also have this barrier call for gen8 vkCmdWaitEvents. We don't implement waiting on events for gen7 yet, but this barrier at least helps to not regress CTS cases when data caching is enabled. Without this, the tests would intermittently report a failure when the data cache was enabled. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	8ee31828c6	anv: Keep track of whether the data cache should be enabled in L3 If images or shader buffers are used, we will enable the data cache in the L3 config. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	ff41738871	genxml/hsw: Add L3 cache control registers These were added to the i965 driver in `5912da45a6`. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jan Vesely	47b390fe45	Treewide: Remove Elements() macro Signed-off-by: Jan Vesely <jano.vesely@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-17 15:28:04 -04:00
Jan Vesely	322cd2457c	r600g,sb: Don't use standard macro name Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-05-17 15:28:03 -04:00
Jason Ekstrand	b6c4d46a58	anv/formats: Add support for VK_FORMAT_B4G4R4A4_UNORM pre-gen8	2016-05-17 12:17:22 -07:00
Jason Ekstrand	45c93384e5	anv: Add a devinfo argument to the get_format functions	2016-05-17 12:17:22 -07:00
Jason Ekstrand	100db3d31c	anv/formats: Set the swizzle to RGB1 when using an RGBA format to fake RGB This way we get correct sampling from RGB formats that are faked as RGBA. This should also cause it to disable rendering and blending on those formats. We should be able to render to them and, on Broadwell and above, we can blend on them with work-arounds. However, we'll add support for that more properly later when it's deemed useful. For now, disabling rendering and blending should be safe.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	ce375fba41	anv/formats: Refactor anv_get_format The new code removes the switch statement and instead handles depth/stencil as up-front special cases. This allows for potentially more complicated color format handling in the future.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	34198d798c	anv: Use 16 bits for the isl_format in anv_format This way the entire anv_format structure fits in 32 bits	2016-05-17 12:17:22 -07:00
Jason Ekstrand	7cae59012d	anv/formats: Use the isl_channel_select enum for the swizzle	2016-05-17 12:17:22 -07:00
Jason Ekstrand	8ed429a4f0	anv/formats: Add an anv_get_format helper This commit removes anv_format_for_vk_format and adds an anv_get_format helper. The anv_get_format helper returns the anv_format by-value. Unlike anv_format_for_vk_format the format returned by anv_get_format is 100% accurate and includes any tweaks needed for tiled vs. linear. anv_get_isl_format is now just a wrapper around anv_get_format that picks off just the isl_format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	13f5cee663	anv/format: Simplify anv_format Now that we have VkFormat introspection and we've removed everything that tried to use anv_format for introspection, we no longer need most of what was in anv_format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	c1c004e5b2	anv/formats: Delete validate_GetPhysicalDeviceFormatProperties All it ever did was some extra logging that was useful when initially bringing up Dota2. We don't need it anymore.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	aad56f3ee7	anv/image: Use aspects for computing full usage	2016-05-17 12:17:22 -07:00
Jason Ekstrand	fbc23d93e0	anv: Remove the anv_format member from anv_image	2016-05-17 12:17:22 -07:00
Jason Ekstrand	be94a23b44	anv/wsi: Use vk_format_info for asserts rather than anv_format	2016-05-17 12:17:22 -07:00
Jason Ekstrand	63dbb2c60a	anv/copy: Use the linear format from the image for the buffer block size Because the buffer is exposed to the user, the block size is defined to always exactly be the size of the actual Vulkan format. This is the same size (it had better be) as the linear image format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	c87429c5f1	anv/image: Stop using anv_format for image create validation	2016-05-17 12:17:22 -07:00
Jason Ekstrand	990a7420b6	anv/image: Make heavier use of aspects	2016-05-17 12:17:22 -07:00
Jason Ekstrand	369b8bf402	anv/copy: Use the color_surf from the image to get the block size	2016-05-17 12:17:22 -07:00
Jason Ekstrand	9102e88364	anv: Change render_pass_attachment.format to a VkFormat	2016-05-17 12:17:22 -07:00
Jason Ekstrand	ffc502ce0c	anv: Add helpers to provide simple VkFormat introspection As much as I hate adding yet more format introspection, there are times when the VkFormat is sufficient and we don't want to round-trip through isl_format. For these times, the new vk_format_info.c/h files provide some simple driver-agnostic VkFormat introspection. This is intended to be specific to Vulkan but not to any driver whatsoever.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	97ba402cc3	anv/image: Use get_isl_format when creating buffer views	2016-05-17 12:17:22 -07:00
Jason Ekstrand	234ecf26c6	anv/image: Add an aspects field This makes several checks easier and allows us to avoid calling anv_format_for_vk_format in a number of cases.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	1bda8d06e5	anv: Make format_for_descriptor return an isl_format	2016-05-17 12:17:22 -07:00
Jason Ekstrand	263a8cb52d	anv/wayland: Don't allow non-renderable formats	2016-05-17 12:17:22 -07:00
Jason Ekstrand	eb6baa3174	anv/wsi: Make WSI per-physical-device rather than per-instance This better maps to the Vulkan object model and also allows WSI to at least know the hardware generation which is useful for format checks.	2016-05-17 12:17:22 -07:00
Adam Jackson	2ad9d6237a	glapi/gen: Copy some GL 1.0 enum details into ARB_viewport_array Otherwise the instances in the extension XML override the core definitions, and we stop knowing their sizes in indirect_size_get.c Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	f4983b194d	glapi: Define PURE for Sun Studio as well Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	f1dd8dd6b6	glapi/glx: Mark byteswap functions as _X_UNUSED (v2) Squashes the one remaining warning in the xserver build. v2: Also clean up some non-standard whitespace (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	ea08a5bcf6	glapi: Harden GLX request size processing (v2) v2: Use == not is for equality testing (Dylan Baker) Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	88cfc9ddaa	glapi: Add the safe_{add,mul,pad} functions from xserver We're about to update the generator scripts to use these, easier not to vary between client and server. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	7bc5c7f586	glapi: Fix whitespace droppings when printing the license header Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Rob Clark	1e93b0caa1	mesa/st: add support for NIR as possible driver IR Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net>	2016-05-17 14:22:46 -04:00
Rob Clark	2bbb140be3	mesa/st: move things around a bit in st_create_fp_variant() Prep work for next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 14:22:46 -04:00
Rob Clark	8f9a46dccb	mesa/st: add nir pass for lowering builtin uniforms Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-17 14:22:46 -04:00
Emil Velikov	52addd90d1	scons: gallium: link against nir as needed ... otherwise we'll produce incomplete binaries with the introduction of NIR as an alternative IR in the next commits. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-17 14:22:46 -04:00
Jason Ekstrand	265487aedf	i965/fs: Add an allow_spilling flag to brw_compile_fs This allows us to disable spilling for blorp shaders since blorp state setup doesn't handle spilling. Without this, blorp fails hard if you run with INTEL_DEBUG=spill. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Francisco Jerez <currojerez@riseup.net>	2016-05-17 10:20:11 -07:00
Ilia Mirkin	dd4b44efc0	nvc0/ir: fix shared atomic lowering to preserve shared memory location We were always doing atomics on shared memory location 0 instead of the originally supplied location. Make sure to pass through the original symbol and any indirection. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org # note: expect minor conflict	2016-05-17 11:22:01 -04:00
Rob Clark	b65bd3dee5	freedreno/ir3: fix compiler warning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-17 10:05:20 -04:00
Rob Clark	e8beffb1b3	nir/validate: dump annotated shader with error msgs Log all the errors, and at the end dump the shader w/ error annotations to make it easier to see where the problems are. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Rob Clark	54ecfcc162	nir/validate: assert() -> validate_assert() Prep work for next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Rob Clark	a0ef26c1c2	nir/print: add support for print annotations Caller can pass a hashtable mapping NIR object (currently instr or var, but I guess others could be added as needed) to annotation msg to print inline with the shader dump. As the annotation msg is printed, it is removed from the hashtable to give the caller a way to know about any unassociated msgs. This is used in the next patch, for nir_validate to try to associate error msgs to nir_print dump. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Alejandro Piñeiro	e5e412cd27	i965: Expose OpenGL 4.2 for gen8+ ARB_vertex_attrib_64bit was the only feature missing. v2: we can expose 4.2 instead of 4.1 (Ian Romanick) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Alejandro Piñeiro	f051eae25a	docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+ v2: label as done for i965/gen8+ instead of i965 (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Alejandro Piñeiro	59b5441fd9	i965: Enable ARB_vertex_attrib_64bit for gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	d6281a9d95	i965: take care of doubles when lowering VS inputs Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	7ea09511ca	i965/fs: calculate first non-payload GRF using attrib slots When computing where the first non-payload GRF starts, we can't rely on the number of attributes, as each attribute can be using 1 or 2 slots depending on whether they are a dvec3/4 or other. Instead, we need to use the number of slots used by the attributes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	b7423b485e	i965/vec4: use attribute slots to calculate URB read length Do not use total attributes because a dvec3/dvec4 attribute requires two slots. So rather use total attribute slots. v2: do not use loop to calculate required attribute slots (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	b0fb08e179	i965: take care of doubles when remapping VS attributes Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero	80535873bb	nir: add double input bitmap This bitmap tracks which input attributes are double-precision. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero	ccfe25f758	i965/fs: shuffle 32bits into 64bits for doubles VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. v2: * Extra blank line. Add { } on if body (Ian Romanick) * Use dest directly (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:47 +02:00
Alejandro Piñeiro	96c276dda9	i965/fs: halve exec_size when dealing with 64-bit attributes The HW has a restriction that only vertical stride may cross register boundaries. Until now this was only handled on VGRFs at brw_reg_from_fs_reg, but it is also needed for attributes. v2: * Remove reference to commit id on commit message (Juan Suarez) * Simplify code that computes final exec_size (Ian Romanick) * Use REG_SIZE on that same code (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Alejandro Piñeiro	1ff32ae8b2	i965: passthru formats cannot be used with edge flag enabled Add an assertion to detect this case. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Antia Puentes	8b0a334b5e	i965: Configure how to store 64PASSTHRU vertex components From the Broadwell specification, structure VERTEX_ELEMENT_STATE description: "When SourceElementFormat is set to one of the 64_PASSTHRU formats, 64-bit components are stored in the URB without any conversion. In this case, vertex elements must be written as 128 or 256 bits, with VFCOMP_STORE_0 being used to pad the output as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into the URB, Component 1 must be specified as VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element." Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for dvec3 and dvec4 vertex elements. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Alejandro Piñeiro	71150b73c8	i965: get the proper vertex surface type for doubles on gen8+ This commit adds support for PASSTHRU format when pushing double-precision attributes. Check glarray->Doubles in order to know if we should choose a format that does a conversion to float, or just passthru the 64-bit double. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Ilia Mirkin	b1d74e9486	nvc0/ir: make sure out-of-bounds buffer loads/atomics get a 0 result Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-17 01:27:29 -04:00
Timothy Arceri	4fb4fd0b6b	glsl: make reserved_varying_slot() static Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:39 +10:00
Timothy Arceri	1d752823af	glsl: include per-patch varyings when generating reserved slot bitfield Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:27 +10:00
Timothy Arceri	00441829e7	glsl: don't incorrectly eliminate patches with explicit locations These varyings have a separate location domain from per-vertex varyings and need to be handled separately. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:21 +10:00
Timothy Arceri	3f477f0ea5	glsl: remove remaining tabs in link_varyings.cpp Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:16 +10:00
Timothy Arceri	6d5f7557fb	glsl: fix location and component packing validation on patches These varyings have a separate location domain from per-vertex varyings and need to be handled separately. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:12 +10:00
Kenneth Graunke	aae0865dc0	i965: Enable ARB_shader_precision on Gen8+. I recently fixed a bug in the Piglit tests: https://lists.freedesktop.org/archives/piglit/2016-May/019802.html With that patch in place, we pass all the tests. So, turn it on. We could probably expose this earlier than Gen8, but the extension says that OpenGL 4.0 is required, and all of our tests are written against GLSL 4.00 (which is only supported on Gen8+). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 17:52:45 -07:00
Jose Fonseca	cf010de6ee	vl/dri: Move the DRI3 check out of sources include into C. Fixes SCons build. Trivial. Built locally with SCons and autotools.	2016-05-16 21:50:43 +01:00
Leo Liu	5e2072c711	st/vdpau: add dri3 support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	c122c74dca	vl/dri3: implement functions for get and set timestamp Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	9f50a79b8f	vl/dri3: handle PresentCompleteNotify event and get timestamp calculated based on the event's reply Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	e8282178ab	st/va: add dri3 support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	8d7ac0a4e4	vl/dri3: implement DRI3 BufferFromPixmap We also need to render to the front buffer of a temporary X pixmap; this is the case when we use OpenGL as video out for VA-API. The basic implementation is to pass the pixmap ID to the X server, which then returns a dma-buf fd, and we get the buffer object through this dma-buf fd. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	858b329c2c	vl/dri3: add support for resizing When the drawable size changes, a PresentConfigureNotify event is emitted; handle the event to re-allocate the resized buffer. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	96580ad593	vl/dri3: implement function for get dirty area This will clear presentation area not covered by video content Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	b0bd908284	vl/dri3: implement function for flush frontbuffer Request drawable content in pixmap by calling DRI3 PresentPixmap, and handle PresentIdleNotify event. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	e1223282db	vl/dri3: add back buffers support This implements DRI3 PixmapFromBuffer. Create buffer objects, and associate it to a dma-buf fd, and then pass this fd with a pixmap ID to X server for creating pixmap object; also add a function for wait events. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	69ba9be4d2	vl/dri3: implement flushing for queued events also place holder for present events handling Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	758b1bbaa7	vl/dri3: register present events Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	672e8d5e7e	vl/dri3: set drawable geometry Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	12e5220e34	vl/dri3: add DRI3 support and implement create and destroy Required functions into place for implementation, create screen with device fd returned from X server, also bail out to DRI2 with certain conditions. v2: -organize the error out path (Axel) -squash previous patch 1 and 2 into one (Emil) Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Dave Airlie	30e437bd76	mesa/version.c: enable cull distance in version check. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-17 06:08:31 +10:00
Ian Romanick	11096ecc39	glsl/linker: Include the interface name for input and output blocks On my oes_shader_io_blocks branch, this fixes 71 dEQP-GLES31.functional.program_interface_query.* tests. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-05-16 11:18:03 -07:00
Ian Romanick	7c11589eb4	glsl/linker: Use canonical format for ARB_program_interface_query spec quotes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:18:03 -07:00
Mark Janes	fd854c1add	i965: check tcs for NULL dereference Coverity issue 1361544 found an instance where the tcs variable is checked for NULL, but unconditionally dereferenced later in the same function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:11:11 -07:00
Matt Turner	bf91034d44	i965: Mark is_lossless_compressed_aux UNUSED to silence warning. Used only in assert().	2016-05-16 11:08:55 -07:00
Matt Turner	1385018a72	genxml: Use llroundf() and store to appropriate type. Both functions return uint64_t, so I expect the masking/shifting should be done on 64-bit types. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-16 11:06:15 -07:00
Matt Turner	4191551262	nir: Mark nir_start_block()/nir_impl_last_block() with returns_nonnull. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:06:15 -07:00
Matt Turner	377ab2f2d7	util: Add ATTRIBUTE_RETURNS_NONNULL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:06:15 -07:00
Jan Vesely	40c6d54e76	clover: grid_offset should be padded with 0 not 1 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 13:58:14 -04:00
Iago Toral Quiroga	71465179fc	i965: Expose OpenGL 4.0 for gen8+ ARB_gpu_shader_fp64 was the only feature missing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:34 +02:00
Iago Toral Quiroga	b1d21e1159	docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	309d285c6b	i965: Enable ARB_gpu_shader_fp64 for gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	58f304defe	i965/tes/scalar: Fix load input for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	61197b8d5d	i965/tcs/scalar: fix store output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	cda3435ea8	i965/tcs/scalar: fix load input for doubles v2: do not write to the original indirect_offset since that is an expression that could be used somewhere else (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	66192b3c16	i965/fs: fix nir_intrinsic_store_output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	3cce67aff0	i965/fs: fix number of output components for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	0297f1021a	i965/vec4: handle doubles in type_size_vec4() The scalar backend uses this to check URB input sizes. v2: Removed redundant break after return (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	8c6d147373	i965/fs: support doubles with shared variable stores This is pretty much the same we do with SSBOs. v2: do not shuffle in-place, it is not safe since the original 64-bit data could be used after the write, instead use a temporary like we do for SSBO stores (Iago) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	943f9442bf	i965/fs: support doubles with ssbo stores Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	b9aa66aa51	i965/fs: add shuffle_64bit_data_for_32bit_write helper This does the inverse operation of shuffle_32bit_load_result_to_64bit_data and we will use it when we need to write 64-bit data in the layout expected by untyped write messages. v2 (curro): - Use subscript() instead of stride() - Assert on the input types rather than silently retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Drop the temporary vgrf and force_writemask_all. - Make component_i const. - Move to brw_fs_nir.cpp v3 (curro): - Pass dst and src by reference. - Simplify allocation of tmp register. - Move to brw_fs_nir.cpp. - Get rid of the temporary. v3 (Iago): - Check that the src and dst regions do not overlap, since that would typically be a bug in the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	33f7ec18ac	i965/fs: support doubles with SSBO loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	8aa01ac596	i965/fs: support doubles with shared variable loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	6eab06b866	i965/fs: Add do_untyped_vector_read helper We are going to need the same logic for anything that reads doubles via untyped messages (CS shared variables and SSBOs). Add a helper function with that logic so that we can reuse it. v2: - Make this a static function instead of a method of fs_visitor (Iago) - We only support types with a size of 4 or 8 (Curro) - Avoid retypes by using a separate vgrf for the packed result (Curro) - Put dst parameter before source parameters (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	b86d4780ed	i965/fs: support doubles with UBO loads UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD instruction, which reads 16 bytes (a vec4) of data from memory. For dvec types this only provides components x and y. Thus, if we are reading more than 2 components we need to issue a second load at offset+16 to read the next 16-byte chunk with components z and w. UBO loads with non-constant offset emit a load for each component in the vector (and rely on CSE to fix redundant loads), so we only need to consider the size of the data type when computing the offset of each element in a vector. v2 (Sam): - Adapt the code to use component() (Curro). v3 (Sam): - Use type_sz(dest.type) in VARYING_PULL_CONSTANT_LOAD() call (Curro). - Add asserts to ensure std140 vector alignment rules are followed (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	58f1804c4f	i965/fs: fix pull constant load component selection for doubles UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a constant offset that is 16-byte aligned. If we need to access an unaligned offset we emit a load with an aligned offset and use the remaining constant offset to select the component into the vec4 result that we are interested in. This component must be computed in units of the type size, since that is what fs_reg::set_smear expects. This patch does this change in the two places where we use this message: In demote_pull_constants when we lower uniform access with constant offset into the pull constant buffer and in UBO loads with constant offset. v2 (Sam): - Fix set_smear() in fs_visitor::lower_constant_loads(), take into account source type instead and remove MAX2 (Curro). - Improve changes to nir_intrinsic_load_ubo case in nir_emit_intrinsic() (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Francisco Jerez	71fd4942d1	i965/fs: Fix and document component(). This fixes a number of bugs of component() by reimplementing it in terms of horiz_offset(): Handling of base registers starting at a non-zero subreg_offset, handling of strided registers and overflow of subreg_offset into reg_offset. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	e209134f71	i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles v2 (Curro): - Assert on scale == 1 when shuffling 64-bit data. - Remove type_slots, use type_sz(vec4_result.type) instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	50b7676dc4	i965/fs: add shuffle_32bit_load_result_to_64bit_data helper There will be a few places where we need to shuffle the result of a 32-bit load into valid 64-bit data, so extract this logic into a separate helper that we can reuse. v2 (Curro): - Use subscript() instead of stride() - Assert on the input types rather than retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Don't use force_writemask_all. - Mark component_i as const. - Make the function name lower case. v3 (Curro): - Pass src and dst by reference. - Move to brw_fs_nir.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Francisco Jerez	4d9c461e53	i965/fs: Stop using the LOAD_PAYLOAD instruction in lower_simd_width. Instead of using the LOAD_PAYLOAD instruction (emitted through the emit_transpose() helper that is no longer useful and this commit removes) which had to be marked force_writemask_all in some cases, emit a series of moves to apply proper channel enable signals to the destination. Until now lower_simd_width() had mainly been used to lower things that invariably had a basic block-local temporary as destination so it didn't seem like a big deal, but I found it to be the reason for several Piglit regressions in my SIMD32 branch and Igalia discovered the same issue independently while working on FP64 support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	9149fd6817	i965/fs: fix copy/constant propagation regioning checks We were not accounting for subreg_offset in the check for the start of the region. Also, fs_reg::regs_read() already takes the stride into account, so we should not multiply its result by the stride again. This was making copy-propagation fail to copy-propagate cases that would otherwise be safe to copy-propagate. Again, this was observed in fp64 code, since there we use stride > 1 often. v2 (Sam): - Rename function and add comment (Jason, Curro). - Assert that register files and number are the same (Jason). - Fix code to take into account the assumption that src.subreg_offset is strictly less than the reg_offset unit (Curro). - Don't pass the registers by value to the function, use 'const fs_reg &' instead (Curro). - Remove obsolete comment in the commit log (Curro). v3 (Sam): - Remove the assert and put the condition in the return (Curro). - Fix function name (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	789eecdb79	i965/fs: fix copy propagation from load payload We were not considering the case where the load payload is writing to a destination with a reg_offset > 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	cf375a3333	i965/fs: fix copy propagation of partially invalidated entries We were not invalidating entries with a src that reads more than one register when we find writes that overwrite any register read by entry->src after the first. This leads to incorrect copy propagation because we re-use entries from the ACP that have been partially invalidated. Same thing for entries with a dst that writes to more than one register. v2 (Sam): - Improve code by defining regions_overlap() and using it instead of a loop (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Francisco Jerez	ea1ef49a16	i965/fs: Reindent register offset calculation of try_copy_propagate(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Francisco Jerez	0fb19806c0	i965/fs: Simplify and fix register offset calculation of try_copy_propagate(). try_copy_propagate() was special-casing UNIFORM registers (the BAD_FILE, ARF and FIXED_GRF cases are dead, see the assertion at the top of the function) and then failing to take into account the possibility of the instruction reading from a non-zero offset of the destination of the copy. The VGRF/ATTR handling takes it into account correctly, and there is no reason we couldn't use the exact same logic for the UNIFORM file aside from the fact that uniforms represent reg_offset in different units. We can work around that easily by defining an additional constant with the right unit reg_offset is expressed in. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	7aa53cd725	i965/fs: disallow type change in copy-propagation if types have different sizes Because the semantics of source modifiers are type-dependent, the type of the original source of the copy must be kept unmodified while propagating it into some instruction, which implies that we need to have the guarantee that the meaning of the instruction is going to remain the same after we have changed the types. When the size of the new type is different from the size of the old type the new and old instructions cannot possibly be equivalent because the new instruction will be reading more data than the old one was. This prevents us from turning this: load_payload(8) vgrf17:DF, \|vgrf4+0.0\|:DF 1sthalf mov(8) vgrf18:DF, vgrf17:DF 1sthalf load_payload(8) vgrf5:DF, vgrf18:DF, vgrf20:DF NoMask 1sthalf WE_all load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf mov(8) vgrf22:UD, vgrf21:UD 1sthalf into: load_payload(8) vgrf17:DF, \|vgrf4+0.0\|:DF 1sthalf mov(8) vgrf18:DF, \|vgrf4+0.0\|:DF 1sthalf load_payload(8) vgrf5:DF, \|vgrf4+0.0\|:DF, \|vgrf4+2.0\|:DF NoMask 1sthalf WE_all load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf mov(8) vgrf22:DF, \|vgrf4+0.4\|<2>:DF 1sthalf where the semantics of the last instruction have changed. v2 (Curro): - Update commit log and add comment to explain the problem better. - Simplify the condition. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	ac9b966aac	i965/fs: Fix copy propagation of load payload for double operands Specifically, consider the size of the data type of the operand to compute the number of registers written. v2 (Sam): - Fix line width (Jordan). - Add an assert (Jordan). - Use REG_SIZE in the calculation of regs_written (Curro) v3 (Sam): - Fix assert and calculation of regs_written (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Francisco Jerez	70dc19f9d6	i965/fs: Fix propagation of copies with strided source. This has likely been broken since we started propagating copies not matching the offset of the instruction exactly (`1728e74957`). The copy source stride needs to be taken into account to find out the offset at the origin that corresponds to the offset at the destination of the copy which is being read by the instruction. This has led to program miscompilation on both my SIMD32 branch and Igalia's FP64 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	17decd940c	i965/fs: fix subreg_offset overflow in byte_offset() This can happen if the register already has a non-zero subreg_offset when byte_offset() is called. v2 (Sam): - Refactor byte_offset() (Jordan). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Kenneth Graunke	2fd79ebe8f	i965: Fix JIP to skip over sibling do...while loops. We've apparently always been botching JIP for sequences such as: do cmp.f0.0 ... (+f0.0) break ... do ... while ... while Because the "do" instruction doesn't actually exist, the inner "while" is at the same depth as the "break". brw_find_next_block_end() thus mistook the inner "while" as the end of the loop containing the "break", and set the "break" to point to the wrong place. Only "while" instructions that jump before our instruction are relevant. We need to ignore the rest, as they're sibling control flow nodes (or children, but this was already handled by the depth == 0 check). See also commit `1ac1581f38`. This prevents channel masks from being screwed up, and fixes GPU hangs(*) in dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.centroid_qualified.multisample_texture_16. The test ended up executing code with no channels enabled, and that code contained FIND_LIVE_CHANNEL, which returned 8 (out of range for a SIMD8 program), which then was used in indirect GRF addressing, which randomly got a boolean value (0xFFFFFFFF), interpreted it as a sample ID, OR'd it into an indirect send message descriptor, which corrupted the message length, sending a pixel interpolator message with mlen 15, which is illegal. Whew :) (*) Technically, the test doesn't GPU hang currently, but only because another bug prevents it from issuing pixel interpolator messages entirely...with that fixed, it hangs. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 00:20:07 -07:00
Kenneth Graunke	2f02fad6b3	i965: Make a "does this while jump before our instruction?" helper. I need to use this in an additional place. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 00:19:53 -07:00
Kenneth Graunke	b6f250d7f2	i965: Send the minimal number of STATE_BASE_ADDRESS packets. STATE_BASE_ADDRESS stalls the whole pipeline, and the documentation cautions us to emit it as little as possible for better performance. We recently put some hacks in BLORP to try and avoid emitting it if it was already set correctly. However, this wasn't quite minimal: if BLORP is the first operation (i.e. glClear()), then it would emit it, and subsequent draw calls would emit it again. This caused a small drop in performance in GPUTest Triangle when switching from Meta to BLORP. Unlike most packets, STATE_BASE_ADDRESS isn't influenced by GL state: it needs to be emitted once per batch, before most other commands, or whenever we change the program cache BO. It's also valid in both the 3D and compute pipelines, which makes it even more unique. This patch removes it from the atom mechanism and instead directly calls it as part of every draw, compute dispatch, or BLORP operation. We introduce a new flag indicating that STATE_BASE_ADDRESS has already been emitted this batch, and if so, skip doing it again. When we make a new program cache BO, we simply reset the flag, so the next operation will emit it again. When we flush/reset the batch, we reset the flag. This guarantees that we'll emit STATE_BASE_ADDRESS only when we have to. It's also less code than the old atom mechanism. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:51 -07:00
Kenneth Graunke	97179c606c	i965: Combine Gen4-7 and Gen8+ state base address emitters. We're about to start calling it directly, and this means the callers won't have to think about generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:50 -07:00
Kenneth Graunke	7b70a12e1c	i965: Move Gen4-5 programs to brw_upload_programs() too. This way all the programs are in one place again, and it also should make some future STATE_BASE_ADDRESS related changes possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:49 -07:00
Kenneth Graunke	b23b099a0b	i965: Mark brw const in brw_state_dirty and callers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:43 -07:00
Kenneth Graunke	8e71ac731b	glsl: Don't do constant propagation in opt_constant_folding. opt_constant_folding is supposed to fold trees of constants into a single constant. Surprisingly, it was also propagating constant values from variables into expression trees - even when the result couldn't be folded together. This is opt_constant_propagation's job. The ir_dereference_variable::constant_expression_value() method returns a clone of var->constant_value. So we would replace the dereference with a constant, propagating it into the tree. Skip over ir_dereference_variable to avoid this surprising behavior. However, add code to explicitly continue doing it in the constant propagation pass, as it's useful to do so. shader-db statistics on Broadwell: total instructions in shared programs: 8905349 -> 8905126 (-0.00%) instructions in affected programs: 30100 -> 29877 (-0.74%) helped: 93 HURT: 20 total cycles in shared programs: 71017030 -> 71015944 (-0.00%) cycles in affected programs: 132456 -> 131370 (-0.82%) helped: 54 HURT: 45 The only hurt programs are by a single instruction, while the helped ones are helped by 1-4 instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:39 -07:00
Kenneth Graunke	db8fcbbaf9	glsl: Avoid excess tree walking when folding ir_dereference_arrays. If an ir_dereference_array has non-constant components, there's no point in trying to evaluate its value (which involves walking down the tree and possibly allocating memory for portions of the subtree which are constant). This also removes convoluted tree walking in opt_constant_folding(), which tries to fold constants while walking up the tree. No need to walk down, then up, then down again. We did this for swizzles and expressions already, but I was lazy back in the day and didn't do this for ir_dereference_array. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:33 -07:00
Kenneth Graunke	329fe93210	glsl: Consolidate duplicate copies of constant folding. We could probably clean this up more (maybe make it a method), but at least there's only one copy of this code now, and that's a start. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:20 -07:00
Kenneth Graunke	3bf27a9a00	glsl: Remove bonus tree walking in opt_constant_folding(). It looks like this was missed when converting opt_constant_folding() from a hierarchical visitor to an rvalue visitor in `6606fde3`. ir_rvalue_visitor already processes values on the way back up the tree, so we will have already visited every child node. There's no point in doing it again. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:10 -07:00
Kenneth Graunke	8e59670bcf	glsl: Make opt_constant_variable() bail in useless cases. The pass ultimately skips over any entries with assignment_count != 1, so there's no need to do further work once we've determined that there are multiple assignments. The constant value could be a large array (i.e. uvec4[327]), at which point skipping the constant_expression_value() call (and the clone() call within) can save us piles of memory. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:05 -07:00
Kenneth Graunke	c907ca6c8d	i965: Flip interpolateAtOffset's y offset when necessary. Fixes 4 dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_offset.no_qualifiers.default_framebuffer - interpolate_at_offset.centroid_qualifier.default_framebuffer - interpolate_at_offset.sample_qualifier.default_framebuffer - interpolate_at_offset.array_element.default_framebuffer Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 23:50:52 -07:00
Kenneth Graunke	6d65b0c6dc	nir: Add a nir->info.uses_interp_var_at_offset flag. I've added this to nir_gather_info(), but also to glsl_to_nir() as a temporary measure, since the i965 GL driver today doesn't use nir_gather_info() yet. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 23:50:28 -07:00
Kenneth Graunke	d4d7e1516b	glsl: Drop bad ASSERT_TRUE in gl_CullDistance link_varyings test. I don't know what the intention was here, but this function returns void. We can't assert anything about its return value. Fixes "make check" failures. v2: Also fix prototype for the function (caught by Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-15 23:49:19 -07:00
Jan Vesely	9525f33164	clover: Handle PIPE_SHADER_IR_NIR in switch Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-15 20:05:10 -04:00
Rob Clark	277818ecfb	freedreno/ir3: small standalone compiler cleanup Don't hard-code the gpu-id anymore. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	f06343d6ea	nir: forward-declare 'struct gl_shader_program' Drop extra #include which is otherwise unneeded (and makes this header difficult to include from outside of src/mesa). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 17:25:48 -04:00
Rob Clark	79d6409a14	nir: return progress from lower_idiv With algebraic-opt support for lowering div to shift, the driver would like to be able to run this pass after the main opt-loop, and then conditionally re-run the opt-loop if this pass actually lowered something. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-15 17:25:48 -04:00
Rob Clark	f8840f471d	freedreno/ir3: lower fdiv Not sure how we didn't hit this already, but since we want fdiv converted into mul + rcp, we should set this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	53cde5e295	freedreno/ir3: handle VARYING_SLOT_PNTC In the glsl->tgsi path, this already gets translated to VAR8, which matches up with rasterizer->sprite_coord_enable. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	2f1581059b	freedreno/ir3: disable TGSI specific hacks in nir case When we got NIR directly from state tracker (vs using tgsi_to_nir) we need to realize this and skip some TGSI specific hacks. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	784086f3c1	freedreno/ir3: add support for NIR as preferred IR For now under debug flag, since only suitable for debugging/testing. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:47 -04:00
Rob Clark	8b24f7b440	nir: fix comment typo about f2d/d2f Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-15 17:25:47 -04:00
Ilia Mirkin	be2b13e3bf	nv50/ir: avoid asserts when the state tracker feeds us bogus inputs INTERP is defined (by me) to have an INPUT source. However the state tracker does not always obey this. This happens due to varying packing logic introducing additional mov's which can't always be undone. Instead of just giving up, we instead try harder to find the original input. This won't always be possible, for example with indirect accesses. There's not much we can (easily) do about that though. This fixes the remaining interpolateAt* failures in dEQP: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at* some of which were asserting due to INTERP_* being passed a non-input. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 14:12:56 -04:00
Ilia Mirkin	9323d084ac	nvc0: don't try to go through the push path for indirect draws This fixes dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute These tests were causing a const vbo to be set up, and were small enough draws that the logic was trying to go via the push path (which emits data directly into the cmd stream rather than uploading a user vbo). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 10:48:39 -04:00
Ilia Mirkin	2ef3cdb07e	nvc0/ir: make sure to align the second arg of TXD to 4, as we do for TEX This was handled in handleTEX(), however the way the logic works, those extra arguments aren't added on by then, so it did nothing. Instead we must duplicate that bit here. GK110 appears to complain about MISALIGNED_GPR, however it's reasonable to believe that GK104 has the same requirements. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 10:48:39 -04:00
Tobias Klausmann	8c02939794	nv50,nvc0: add support for cull distances Cull distances are just a special case of clip distances as far as the hardware is concerned. Make sure that the relevant "planes" are enabled, and flip the clip mode to cull for those. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: add enables on nvc0, add nv50 support] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-05-15 10:48:39 -04:00
Ilia Mirkin	2ad970ecf4	st/mesa: disable cull distance for now The pass that st/mesa relies on to combine clip and cull distances has been reverted, so we can't expose ARB_cull_distance until that is resolved. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-15 10:48:38 -04:00
Jason Ekstrand	09e041d61d	i965: Use blorp for all clears We used to use a meta path on gen8 but we haven't since `c7cf17ae75`. We might as well delete the meta path since blorp works on all gens. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	1cfb4bc890	i965: Use blorp for all stencil blits We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	64f2907030	i965: Use blorp for all updownsample blits We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	f5febc83a7	i965/blorp: Add support for 16x MSAA Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	a32315bd19	i965: move brw_meta_set_fast_clear_color to brw_meta_util.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	36529f670f	i965: Move brw_meta_get_*_rect to brw_meta_util.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	21034f1b08	i965: Move brw_is_color_fast_clear_compatible to brw_meta_util Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	b05c68fc8a	i965: Move brw_get_rb_for_slice to brw_meta_util Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	672cffee0f	i965/blorp: Get rid of the blorp_prog_data_int() helper The helper was initially created to allow us to set reasonable defaults as we mutated the brw_blorp_prog_data structure in preparation for NIR. Now that everything is going through brw_blorp_compile_nir_shader() which fully fills out the brw_blorp_prog_data structure, we don't need the helper. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:54 -07:00
Jason Ekstrand	c228ea8345	i965/blorp: Delete the old blorp shader emit code Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:54 -07:00
Jason Ekstrand	c18da26abf	i965/blorp: Stop doing f2i(i2f(sample_id)) NIR gets kind of awkward when you have a 3-component vector with two floats and one int. This led to us accidentally going through float for the sample index. It doesn't hurt anything but it also isn't needed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	e503da61c6	i965/blorp: Refactor coordinate munging The original code-flow tried to map original blorp. This puts things more where they belong and simplifies some of the logic. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	8636937dd6	i965/blorp: Add bilinear blending support to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6bd7bd6633	i965/blorp: Add support for averaging resolves to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	c7269c1551	i965/blorp: Add MSAA encode/decode support to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	df8c2936cd	i965/blorp: Add support for W-[de]tiling to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6adb8d6d3a	i965/blorp: Add support for discard-based bounds checks to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	4bdace0791	i965/blorp: Add initial support for NIR-based blit shaders Many of the more complex cases still fall back to the old shader builder. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	b0275ad0c9	i965/blorp: Refactor getting the blit kernel into a helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6df3d75206	i965/blorp: Use NIR for clear shaders Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95373 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	bb45f42f55	i965/blorp: Create the program key in get_clear_kernel There's no reason to be passing a whole struct around just for a single boolean. We can create it later when we actually need to use it as a key. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	c1fe8859d3	i965/blorp: Add a helper for compiling NIR shaders Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	353eadb170	blorp: Add initial state setup support for SIMD8 dispatch Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	cd5a2905cf	i965/blorp: Add a param array to prog_data This array allows the push constants to be re-arranged on upload. The actual arrangement will, eventually, come from the back-end compiler. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	c46cbe19f4	i965/blorp: Add a prog_data_init helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	50e5e1f747	i965/fs: Implement the new NIR MCS texturing Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:49 -07:00
Jason Ekstrand	f47faa4316	nir: Add texture opcodes and source types for multisample compression Intel hardware does a form of multisample compression that involves an auxiliary surface called the MCS. When an MCS is in use, you have to first sample from the MCS with a special opcode and then pass the result of that operation into the next sample instruction. Normally, we just do this ourselves in the back-end, but we want to expose that functionality to NIR so that we can use MCS values directly in NIR-based blorp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:44 -07:00
Jason Ekstrand	87a41e862b	nir/builder: Add a helper for grabbing multiple channels from an ssa def This is similar to nir_channel except that it lets you grab more than one channel by providing a mask. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:40 -07:00
Jason Ekstrand	fc58cb543f	nir/builder: Generate the alu helpers directly in python There's no reason for having a macro and a python generator. We can easily just do the whole thing in python. This has the advantage that we are no longer defining ALU# macros which conflict with the ones in brw_fs_builder.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:38 -07:00
Jason Ekstrand	a0e6e5f21f	i965/fs: Use MRF0 for the repclear message This is what BLORP does. Making them match cuts down on the noise when looking at AUB diffs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:33 -07:00
Jason Ekstrand	5a68df87da	i965/blorp: Simplify the sample layout calculation Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:30 -07:00
Jason Ekstrand	bee160b31b	i965/fs: Organize prog_data by ksp number rather than SIMD width The hardware packets organize kernel pointers and GRF start by slots that don't map directly to dispatch width. This means that all of the state setup code has to re-arrange the data from prog_data into these slots. This logic has been duplicated 4 times in the GL driver and one more time in the Vulkan driver. Let's just put it all in brw_fs.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:25 -07:00
Jason Ekstrand	7be100ac9a	i965/gen7_wm: Move where we set the fast clear op This better matches gen8 state setup Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:21 -07:00
Jason Ekstrand	1ec466d0ff	i965/fs: Stop setting dispatch_grf_start_reg from the visitor Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:18 -07:00
Jason Ekstrand	082768af30	i965/fs: Clean up the logic in compile_fs a bit Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:13 -07:00
Jason Ekstrand	b0f8768905	i965/state: Clean up WM/PS state to pull more things out of prog_data Now that we have a persample_shading bit in prog_data we can reduce the amount the state setup code needs to be looking at the GL state. In particular, it no longer pulls anything directly out of the gl_fragment_program and no longer depends on NEW_FRAGMENT_PROGRAM. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:10 -07:00
Jason Ekstrand	712a980add	i965/fs: Rework the persample shading key/prog_data bits This commit reworks and simplifies the way we handle persample shading in the shader key and prog_data. The previous approach had three different key bits that had slightly different and hard-to-discern meanings while the new bits are far more clear. This commit changes it to two easily understood bits that communicate everything we need: 1) key->persample_interp: means that the user has requested persample interpolation through the API. This is equivalent to having SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high enough that you actually get multiple per-sample invocations. 2) key->multisample_fbo: means that the shader will be running on an actual multi-sampled framebuffer. This commit also adds a new "persample_dispatch" bit to prog_data which indicates that the shader should be run in persample mode. This way the state setup code doesn't have to look at the fragment program or GL state and can just pull that data out of the prog_data. In theory, this shuffle could mean more recompiles. However, in practice, we were shoving enough state into the key before that we were probably hitting a recompile on every per-sample shader anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:05 -07:00
Jason Ekstrand	a2f50d87b6	nir: Add an info bit for uses_sample_qualifier Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:33:52 -07:00
Kenneth Graunke	59156b2e96	i965: Fix undefined df bits in brw_reg comparisons. Commit `5310bca024` added a new "double df" field to the brw_reg struct, adding an extra 4 bytes of data that isn't usually initialized (or may contain irrelevant garbage if the struct is mutated). This means that it's no longer safe to memcmp(). Instead, add a brw_regs_equal() function which ignores the extra df bits unless they matter. To keep the implementation cheap, we wrap the first set of fields in a union/struct so that we can use a single DWord comparison. v2: Drop unnecessary casts (caught by Francisco Jerez). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-14 00:18:37 -07:00
Dave Airlie	9f8867d877	i965: disable cull distance temporarily. I'll fix this up on Monday, so leave the docs changes in place. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 11:39:34 +10:00
Dave Airlie	7a6d55826e	Revert "glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)" This reverts commit `ad355652c2`. This broke a bunch of clip tests.	2016-05-14 11:39:34 +10:00
Ian Romanick	a608e946b5	docs: Mark GL_OES_shader_io_blocks as started Watch the oes_shader_io_blocks of my fd.o Mesa GIT repo for progress. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-13 17:48:46 -07:00
Kristian Høgsberg Kristensen	4e959cf9f9	docs: update ARB_cull_distance status. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-05-13 16:32:14 -07:00
Kristian Høgsberg Kristensen	c564348a2e	i965: Add support for GL_ARB_cull_distance Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-13 16:28:25 -07:00
Ilia Mirkin	a1c2444792	st/mesa: flip y coordinate of interpolateAtOffset for winsys This fixes a few dEQP tests like dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.no_qualifiers.default_framebuffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-13 19:17:41 -04:00
Ilia Mirkin	0d8e850195	glsl: make sure that textureProj(bias) variants are only exposed in fs Many were already marked as fs_only, but not all. This fixes the remaining ir_txb entries. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-13 19:17:26 -04:00
Ilia Mirkin	37c8f4c609	glsl: be more strict when validating shader inputs interpolateAt* can only take input variables or an element of an input variable array. No structs. Further, GLSL 4.40 relaxes the requirement to allow swizzles, so enable that as well. This fixes the following dEQP tests: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_struct_member dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_struct_member dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_struct_member Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-13 19:17:26 -04:00
Ilia Mirkin	5239f1e0c9	glsl: make sure that interpolateAt arguments are variables In the case of a constant, it might have been propagated through and variable_referenced() returns NULL. Error out in that case. Fixes 3 dEQP tests: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_constant dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_constant dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_constant Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-13 19:17:26 -04:00
Tobias Klausmann	8f45f4f3ca	mesa/st: Add support for GL_ARB_cull_distance (v2) v2: don't bother with cull dist varyings except to assert. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:28:23 +10:00
Tobias Klausmann	2be258ea18	gallium: Add a pipe cap for arb_cull_distance This lets us safely enable or disable the extension as needed Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:28:17 +10:00
Tobias Klausmann	d656736bbf	glsl: Add arb_cull_distance support (v3) v2: make too large array a compile error v3: squash mesa/prog patch to avoid static compiler errors in bisect Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:28:08 +10:00
Tobias Klausmann	ad355652c2	glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4) This will come in handy when we want to lower gl_CullDistance into gl_CullDistanceMESA. [airlied: drop separate APIs for clip/cull - just use single API to call both passes.] v3: reexamine my sanity, this was pretty broken, the new code creates one copy of gl_ClipDistanceMESA, as the clip distance varying and lowers everything into that in two passes, one for clips one for culls. v4: rework using the passes in clip/cull sizes, instead of the array sizes. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:28:07 +10:00
Dave Airlie	dd3390e12f	glsl: rename lower_clip_distance to lower_distance. This just renames the file in anticipation of adding cull lowering, and renames the internals. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:27:40 +10:00
Tobias Klausmann	eb18fea707	mesa/main: Add support for GL_ARB_cull_distance (v2) airlied: v2: rename LowerClipDistance to LowerCombinedClipCullDistance. I don't think we want any other behaviour with any current hw. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:27:29 +10:00
Tobias Klausmann	f2a2e08e01	glapi: Add GL_ARB_cull_distance Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:27:10 +10:00
Nanley Chery	6674d018f7	anv/copy: Fix copying Images from Buffers with larger dimensions This function previously assumed that the Buffer and Image had matching dimensions. However, it is possible to copy from a Buffer with larger dimensions than the Image. Modify the copy function to enable this. v2: Use ternary instead of MAX for setting bufferExtent (Jason Ekstrand) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95292 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Matthew Waters <matthew@centricular.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-13 14:08:57 -07:00
Maarten Lankhorst	ff5c312623	.mailmap: Fix my email addresses. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>	2016-05-13 12:28:05 +02:00
Nicolai Hähnle	a694c20ecf	radeonsi/sid_tables: rename reg_table to sid_reg_table This is purely cosmetic, making it easier to assign blame for space used in the binary in case somebody else makes a similar cleanup effort in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	c7f73a70f0	radeonsi/sid_tables: store offset into global fields table instead of pointer This avoids relocations in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	54ab39caaf	radeonsi/sid_tables: store strings by offset instead of by pointer This saves some space and avoids the need for relocations. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	ca8f71f4cb	r600: remove TABLE_SIZE macro Use ARRAY_SIZE instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	43ac091e4c	r600: move alu_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	390c740b99	r600: move cf_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	a180e1d22d	r600: move fetch_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:37 -05:00
Nicolai Hähnle	6d350fb13f	r600: protect r600_isa.h with extern "C" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:37 -05:00
Bas Nieuwenhuizen	ac77fb74a0	gallium/ddebug: Implement launch_grid. Does not implement dumping info. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-13 07:43:46 +02:00
Bas Nieuwenhuizen	22b35122fa	gallium/ddebug: Support compute states. v2: Reuse the macro for bind & delete. Note that may not be able to share the delete long-term as pipe_compute_state contains members not in pipe_shader_state, and we need to distinguish the pointer location if we add that struct to the union. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-13 07:43:37 +02:00
Bas Nieuwenhuizen	5efe477b13	gallium/ddebug: Add passthrough for get_compute_param. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-13 07:39:12 +02:00
Ian Romanick	8f05a0a4c0	nir: Remove empty visit_call_src and visit_load_const_src functions The guts were removed in `dfb3abba`. It has been almost exactly a year, so I don't think we're going to "decide we want [predication] back." Silences several "unused parameter" warnings: nir/nir.c: In function ‘visit_call_src’: nir/nir.c:1052:32: warning: unused parameter ‘instr’ [-Wunused-parameter] visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state) ^ nir/nir.c:1052:58: warning: unused parameter ‘cb’ [-Wunused-parameter] visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state) ^ nir/nir.c:1052:68: warning: unused parameter ‘state’ [-Wunused-parameter] visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state) ^ nir/nir.c: In function ‘visit_load_const_src’: nir/nir.c:1058:44: warning: unused parameter ‘instr’ [-Wunused-parameter] visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb, ^ nir/nir.c:1058:70: warning: unused parameter ‘cb’ [-Wunused-parameter] visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb, ^ nir/nir.c:1059:28: warning: unused parameter ‘state’ [-Wunused-parameter] void *state) ^ v2: Add some comments in nir_foreach_src suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Connor Abbott <cwabbott0@gmail.com>	2016-05-12 16:47:14 -07:00
Ian Romanick	098166e1bc	nir: Silence unused parameter warnings These cases had the parameter removed: nir/nir_lower_vec_to_movs.c: In function ‘try_coalesce’: nir/nir_lower_vec_to_movs.c:124:66: warning: unused parameter ‘shader’ [-Wunused-parameter] try_coalesce(nir_alu_instr *vec, unsigned start_idx, nir_shader *shader) ^ nir/nir_lower_io.c: In function ‘load_op’: nir/nir_lower_io.c:147:32: warning: unused parameter ‘state’ [-Wunused-parameter] load_op(struct lower_io_state *state, ^ These cases had the parameter (void) silenced because the parameter was necessary for an interface: nir/glsl_to_nir.cpp:1900:32: warning: unused parameter 'ir' [-Wunused-parameter] nir_visitor::visit(ir_barrier *ir) ^ nir/nir.c: In function ‘remove_use_cb’: nir/nir.c:802:35: warning: unused parameter ‘state’ [-Wunused-parameter] remove_use_cb(nir_src *src, void *state) ^ nir/nir.c: In function ‘remove_def_cb’: nir/nir.c:811:37: warning: unused parameter ‘state’ [-Wunused-parameter] remove_def_cb(nir_dest *dest, void *state) ^ Number of total warnings in my build reduced from 2543 to 2538 (reduction of 5). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-12 16:46:41 -07:00
Leo Liu	bd9ae72459	vl/dri: fix close fd error out fd should be set to -1 only if it got closed by pipe_loader_release. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-05-12 18:26:48 -04:00
Samuel Pitoiset	988b09f9ac	nvc0: fix indentation in nvc0_invalidate_resource_storage() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Samuel Pitoiset	abb3401095	nvc0: save some CPU cycles in nvc0_context_unreference_resources() This reduces the number of loop iterations for invalidating buffers and images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Samuel Pitoiset	b8f0b00a9a	nvc0: invalidate texture buffers for compute This is a pretty rare situation, but it can happen. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Tim Rowley	2785f2f2d7	swr: properly expose compressed format support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 14:12:18 -05:00
Jason Ekstrand	5186545d66	anv: Don't advertise shaderImageGatherExtended We don't actually support all of the extended gather functionality so we shouldn't be advertising it.	2016-05-12 10:57:00 -07:00
Rob Clark	9d3cc80b75	nir: glsl_get_bit_size() should take glsl_type It's what all the call-sites want, so gets rid of a bunch of inlined glsl_get_base_type() at the call-sites. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-12 13:39:40 -04:00
Topi Pohjolainen	b19cff1639	i965/gen9: Enable lossless compression I tried first creating the auxiliary buffer the same time with the color buffer. That, however, led me into a situation where we would later create the rest of the mip-levels and the compression would need to be disabled (it is only supported for single level buffers). Here we try to create it on demand just before the hardware starts to render. This is similar to what we do with fast clear buffers, their creation is deferred until the first clear. This setup also gives the opportunity to detect if the miptree represents the temporary texture used internally in the mesa core. This texture is mostly written by cpu and therefore enabling compression for it doesn't make much sense. Note that a heuristic is included. Floating point formats are not enabled yet as they are only seen to hurt performance. Some highlights with window system driver kept fixed to default and only the application driver changing: Manhattan: 8.32152% +/- 0.355881% Offscreen: 9.09713% +/- 0.340763% Glb trex: 8.46231% +/- 0.460624% Offscreen: 9.31872% +/- 0.463743% v2 (Ben): Re-use msaa layout type for single sampled case. v3: Moved the deferred allocation of mcs to brw_try_draw_prims() and brw_blorp_blit_miptrees() instead. v4: (Ken): Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD when allocating mcs. Do not enable for scanout buffers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	cd9e97a020	i965: Set render state for lossless compressed v2: Add support for blorp and removed the support for meta v3 (Ben): Add assertion on compressed non-fast clear - must be partial clear. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	cda8c2a911	i965/wm: Don't sample lossless compressed as multisampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	683dda0083	i965/gen9: Setup MCS for compressed texture surfaces Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	1a05aeeb1c	i965/blorp: Do not resolve lossless compressed blit sources Blorp blits use sampling engine which is capable of resolving on the fly. Buffers are still resolved for blitter engine. Current understanding is that blitter doesn't understand lossless compression. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	01ba26d0b0	i965/blorp: Prepare blits for lossless compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	84066ebd63	i965: Deferred allocation of mcs for lossless compressed Until now mcs was associated to single sampled buffers only for fast clear purposes and it was therefore the responsibility of the clear logic to allocate the aux buffer when needed. Now that normal 3D render or blorp blit may render with mcs enabled also, they need to prepare the mcs just as well. v2: Do not enable for scanout buffers v3 (Ben): - Fix typo in commit message. - Check for gen < 9 and return early in brw_predraw_set_aux_buffers() - Check for gen < 9 and return early in intel_miptree_prepare_mcs() v4: Check for msaa_layout and number of samples to determine if lossless compression is to be used. Otherwise one cannot distinguish between fast clear with and without compression. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:26 +03:00
Topi Pohjolainen	1ca02b6ebb	i965: Add flag telling if miptree is for client consumption Consider later on adding specific disable flags such as MIPTREE_LAYOUT_DISABLE_AUX_MCS = 1 << 3, /* CCS_D */ MIPTREE_LAYOUT_DISABLE_AUX_CCS_E = 1 << 4, MIPTREE_LAYOUT_DISABLE_AUX = MIPTREE_LAYOUT_DISABLE_AUX_MCS | MIPTREE_LAYOUT_DISABLE_AUX_CCS_E, and equivalent boolean/enums into miptree. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	a6e0f1cc7f	i965: Add helper for lossless compression support v2: Check explicitly against base type of GL_FLOAT instead of using _mesa_is_format_integer_color(). Otherwise we miss GL_UNSIGNED_NORMALIZED. v3 (Ben): Also call intel_miptree_supports_non_msrt_fast_clear() in order to really check everything. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	874c5f05db	i965/gen9: Prepare surface state setup for lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). v3 (Ben): Do not set fast clear state in surface state setup. Moved into brw_postdraw_set_buffers_need_resolve() using a separate patch. v4: Support for blorp v5 (Ben): Re-use gen8_get_aux_mode() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	a8544267fd	i965/gen8: Expose auxiliary mode resolver Also use the opportunity to drop the unused surface type argument. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	94926492d8	i965: Relax assertion of halign == 16 for lossless compressed aux Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	ba9f954e60	i965/blorp: Set full resolve for lossless compressed v2 (Ben): Introduce union for fast clear and resolve ops Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	58e7392e12	i965/blorp: Do not skip fast color clear with new color This hasn't been visible before. It showed up with lossless compression with: dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgb8 Current fast clear logic kicks color resolves even for gpu sampling. In the test case this results into trashing of the fast color clear state between two subsequent clears, and therefore each clear is performed correctly. With lossless compression the resolves are unnecessary and therefore the clear state indicates that the buffer is already cleared. Without considering if the previous color value was the same as the new, clears that need to be performed are skipped and the buffer ends up holding old pixel values. v2 (Ken): Fix the comparison for gen < 9 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:48:47 +03:00
Kenneth Graunke	12dcad1b42	i965: Enable scalar GS by default. I'd originally left this off because Orbital Explorer was hanging the GPU, but it seems to be working these days. There have been a bunch of changes since then, so we probably fixed something. On my Broadwell laptop, both Synmark/GSCloth and Orbital Explorer seem to run at approximately the same framerate in either mode. This is despite large reductions in instruction count for Synmark, and large increases for Orbital Explorer. It apparently just doesn't matter. Switching to scalar mode will gain us fp64 support in the next release, as vec4-mode support isn't yet ready. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	607fb0f13d	i965: Reduce the SIMD8 GS push constant threshold from 32 to 24. Three Shadow of Mordor geometry shaders increase by a single instruction, but the number of spills/fills in Orbital Explorer is reduced from 194:1279 -> 82:454. No other programs are affected. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	3aa542c657	i965: Delete bogus assertion in emit_gs_input_load(). This looks like leftover cruft from an earlier attempt at writing point size hacks. Each vertex has its own copy of gl_PointSize, so accessing any vertex other than 0 would cause this to fail. The tests seem to work fine without it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	1c41cb58de	i965: Support instanced GS inputs in the scalar backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:36 -07:00
Kenneth Graunke	5fc3772650	i965: Use an early return for the push case in emit_gs_input_load(). Just trying to keep things from getting too ugly in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 00:59:08 -07:00
Kenneth Graunke	e9ca952581	i965: Drop BRW_NEW_BLORP from stipple and line parameter packets. BLORP never touches these, and they're all non-pipelined. Some are fairly large packets as well. I haven't tried to benchmark this; the effect is likely to be small. However, we may as well stop the pointless papercuts; maybe they'll add up someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-12 00:54:37 -07:00
Jakob Sinclair	18f7c88dd6	glsl: fixed uninitialized pointer Class "ir_constant" had a bunch of constructors where the pointer member "array_elements" had not been initialized. This could have led to unsafe code if something had tried to write anything to it. This patch fixes this issue by initializing the pointer to NULL in all the constructors. This issue was discovered by Coverity. CID: 401603, 401604, 401605, 401610 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-12 09:46:36 +02:00
Ilia Mirkin	ba3f0b6d59	nvc0: fix gl_SampleMaskIn computation The SAMPLEMASK semantic should only return the bits set covered by the current invocation. However we were always retrieving the covmask, which returns the covered samples of the whole pixel. When not doing per-sample invocation, this is precisely what we want. However when doing per-sample invocation, we have to select the sampleid'th bit and only return that. Furthermore, this means that we have to have a 1:1 correlation for invocations and samples. This fixes most dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* tests. A few failures remain due to disagreements about nr_samples==1 logic as well as what happens with MSAA x2 RTs when the shading fraction is 0.5. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-11 20:39:27 -04:00
Ilia Mirkin	f5fe903002	nv50/ir: generalize interp fixups to be able to fixup anything Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-11 20:39:26 -04:00
Jason Ekstrand	66a442687f	.mailmap: Update the e-mail addresses for Kristian Høgsberg This changes it to use his personal e-mail and adds his @intel.com address Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-11 12:32:14 -07:00
Jason Ekstrand	7e759fbd60	.mailmap: Use Connor Abbott's personal e-mail	2016-05-11 12:27:15 -07:00
Giuseppe Bilotta	9c3392cb3a	Add .mailmap This adds a first tentative .mailmap file, to canonicalize contributor name/emails in shortlogs and other statistical endeavours. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:21:46 -07:00
Jason Ekstrand	f1dcc7976a	i965: Stop splitting fma() prior to optimization According to the GLSL spec, if the user uses the fma() intrinsic to generate a precise-consumed value, and you have it in your hardware, you shouldn't split it. For a while now, we've been splitting all ffma's up-front and then planned to fuse them later which isn't valid. Correctly handling the GLSL behaviour fixes rendering corruptions in Tomb Raider. The only reason why doing this possibly helped before was for ARB programs which is handled by the previous commit. Shader-db results on Haswell: total instructions in shared programs: 7560300 -> 7561510 (0.02%) instructions in affected programs: 56265 -> 57475 (2.15%) helped: 86 HURT: 291 The only shaders in the database that are affected are from "Shadow of Mordor" which is the first app in our database to use fma(). We could, at some point in the future, split inexact ffma opcodes which would fix the shader-db regressions since Shadow of Mordor doesn't use precise. However, this fixes a bug now and the shader-db impact is fairly small. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Jason Ekstrand	47f01e538a	ptn: Emit mul+add for MAD Unlike fma() in GLSL, MAD in ARB programs is 100% splittable. Just emit the split version and let the optimizer fuse them later. Shader-db results on Haswell: total instructions in shared programs: 7560379 -> 7560300 (-0.00%) instructions in affected programs: 143928 -> 143849 (-0.05%) helped: 443 HURT: 250 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Jason Ekstrand	1b72c31e1f	nir/algebraic: Separate ffma lowering from fusing The i965 driver has its own pass for fusing mul+add combinations that's much smarter than what nir_opt_algebraic can do so we don't want to get the nir_opt_algebraic one just because we didn't set lower_ffma. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Rob Clark	5886d1bad1	anv: fix build break Previous rename of lower-output-to-temps pass predated merging of anv, and apparently vulkan wasn't enabled in my local builds so overlooked this when rebasing. Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 14:03:24 -04:00
Rob Clark	697382eb61	mesa/st: split the type_size calculation into its own file We'll want to re-use this for NIR. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:12 -04:00
Rob Clark	0e5a369879	glsl: export accessor for builtin-uniform descriptors We'll need this for a nir pass to lower builtin-uniform access. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:12 -04:00
Rob Clark	dfbabc6bad	nir/lower-io: add support for lowering inputs Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	595f9d5476	nir/lower-io: split out some helper fxns Prep work to reduce the noise in the next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b085016f94	nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries Since it will gain support to lower inputs, give it a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 12:20:11 -04:00
Rob Clark	47fcef9a20	nir: move callsite of lower_outputs_to_temporaries Going to convert this pass to parameterized lower_io_to_temporaries, and we want the user to be able to specify whether to lower outputs or inputs or both. The restriction of running this pass before validate to avoid output reads no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	5261947260	nir: lower-io-types pass A pass to lower complex (struct/array/mat) inputs/outputs to primitive types. This allows, for example, linking that removes unused components of a larger type which is not indirectly accessed. In the near term, it is needed for gallium (mesa/st) support for NIR, since only used components of a type are assigned VBO slots, and we otherwise have no way to represent that to the driver backend. But it should be useful for doing shader linking in NIR. v2: use glsl_count_attribute_slots() rather than passing a type_size fxn pointer Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b10cc24519	nir: passthrough-edgeflags support Handled by tgsi_emulate for glsl->tgsi case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	3a939d034e	nir: add lowering pass for glBitmap Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	12c18ce476	nir: add lowering pass for glDrawPixels Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b26645a00f	nir: add lowering pass for y-transform Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	e1d80f8603	gallium: add NIR as a possible IR Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:11 -04:00
Rob Clark	425dc4c4b3	gallium: refactor pipe_shader_state to support multiple IR's The goal is to allow the pipe driver to request something other than TGSI, but detect whether what is getting is TGSI vs what it requested. The pipe drivers will always have to support TGSI (and convert that into whatever it is that they prefer), but in some cases we should be able to skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir). I think pipe_compute_state should get similar treatment. Currently, afaict, it has one user and one consumer, which has allowed it to be sloppy wrt. supporting alternative IR's. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:11 -04:00
Rob Clark	4500d17245	freedreno: fix multi-layer transfer_map's The use of transfer_inline_write() in TexSubImage path (see `fb9fe352ea`) exposed a bug for "layer_first" resources (ie. a4xx) not setting correct layer_stride. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 12:03:21 -04:00
Juan A. Suarez Romero	9bea018994	glsl: use var with initializer on global var validation Currently, when cross validating global variables, all global variables seen in the shaders that are part of a program are saved in a table. When checking a variable that already exists in the table, we check that both are initialized to the same value. If the already saved variable does not have an initializer, we copy it from the new variable. Unfortunately this is wrong, as we are modifying something that is constant. Also, if this modified variable is used in another program, it will keep the initializer, when it should have none. Instead of copying the initializer, this commit replaces the old variable with the new one. So if we see the same variable again with an initializer, we can compare whether both are the same or not. v2: convert tabs to whitespace (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 13:50:04 +02:00
Jordan Justen	2c1c060b03	util/ralloc: Remove double zero'ing of rzalloc buffers Juha-Pekka found this back in May 2015: <1430915727-28677-1-git-send-email-juhapekka.heikkila@gmail.com> From the discussion, obviously it would be preferable to make ralloc_size no longer return zeroed memory, but Juha-Pekka found that it would break Mesa. In <56AF1C57.2030904@gmail.com>, Juha-Pekka mentioned that patches exist to fix i965 when ralloc_size is fixed to not zero memory, but the patches have not made their way to mesa-dev yet. For now, let's stop doing the double zeroing of rzalloc buffers. v2: * Move ralloc_size code to rzalloc_size, and add a comment as suggested by Ken. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 22:54:46 -07:00
Jonathan Gray	e3d43dc5ea	genxml: avoid using a GNU make pattern rule % pattern rules are a GNU extension. Convert the use of one to an inference rule to allow this to build on OpenBSD. v2: inference rules can't have additional prerequisites so add a target rule to still depend on gen_pack_header.py Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-10 20:54:33 -07:00
Roland Scheidegger	430797843a	gallivm: improve dumping of bitcode Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode. Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple modules can be dumped (albeit it might still overwrite previous modules, particularly the modules from draw tend to always have the same name). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-11 04:43:35 +02:00
Vinson Lee	8d639138c7	swr: [rasterizer] Include cmath for std::isnan and std::isinf. This patch fixes this build error. CXX rasterizer/memory/libswrAVX_la-ClearTile.lo In file included from rasterizer/memory/ClearTile.cpp:34:0: ./rasterizer/memory/Convert.h: In function ‘uint16_t Convert32To16Float(float)’: ./rasterizer/memory/Convert.h:170:9: error: ‘__builtin_isnan’ is not a member of ‘std’ if (std::isnan(val)) ^ ./rasterizer/memory/Convert.h:170:9: note: suggested alternative: <built-in>: note: ‘__builtin_isnan’ ./rasterizer/memory/Convert.h:176:14: error: ‘__builtin_isinf_sign’ is not a member of ‘std’ else if (std::isinf(val)) ^ ./rasterizer/memory/Convert.h:176:14: note: suggested alternative: <built-in>: note: ‘__builtin_isinf_sign’ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95180 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-10 17:11:05 -07:00
Jason Ekstrand	a5660bf1f8	i965/blorp: Don't blend integer values during MSAA resolves Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 15:32:00 -07:00
Jason Ekstrand	4f4f393bf3	meta/blit: Don't blend integer values during MSAA resolves Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 15:31:50 -07:00
Jason Ekstrand	203c786a73	i965/fs: Default all constants to a location of -1 Otherwise constants which aren't live get an undefined constant location. When we go to set up param and pull_param we end up assigning all unused uniforms to slot 0. This causes the Vulkan driver to segfault because it doesn't have pull_param. This fixes bugs in the Vulkan driver introduced in `c3fab3d000`. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-05-10 15:25:30 -07:00
Dave Airlie	d36d11ad90	st/glsl_to_tgsi: attach image to correct instruction for samples This fixes a crash (but not the test): GL45-CTS.shader_texture_image_samples_tests.functional_test Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:55:09 +10:00
Dave Airlie	07df3b81ff	mesa: move MESA_MAP_NOWAIT_BIT up away from GL_MAP_PERSISTENT_BIT This was colliding badly and making GL45-CTS.buffer_storage.map_persistent_texture fail on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:54:56 +10:00
Dave Airlie	b230d51a18	mesa/meta: check for signed/unsigned int conversion for pbo getteximage When doing GetTexSubImage using a PBO, we should check if it involves a signed/unsigned conversion and bail if it does, just like in the other cases. This fixes: GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo on Haswell at least. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95324 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:52:20 +10:00
Matt Turner	8bb156a261	i965: Handle BRW_OPCODE_DO on Gen6+ in brw_instruction_name(). This became a problem after the recent disassembler changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 12:12:46 -07:00
Bas Nieuwenhuizen	3d21720d31	radeonsi: Set declared tessellation LDS size to hardware size. The calculated limit gave problems on SI as it was > 32 KiB and the hardware LDS size on SI is only 32 KiB. It isn't correct anyway when processing multiple patches in a threadgroup. As we potentially have any number of patches such that the used LDS is at most the hardware LDS size, and exact size per patch is not known at compile time, this seems like the only valid bound. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-10 20:14:55 +02:00
Rob Clark	8623e599fc	freedreno/ir3: size input/output arrays properly We index into these based on var->data.driver_location, which might have gaps (ie. two inputs, one w/ drvloc 0 and other 2). This shows up in (for example) 'bin/copyteximage 1D', but was only noticed recently due to additional asserts. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-10 13:17:27 -04:00
Ian Romanick	2483a9a08c	ir_to_mesa: Emit smarter ir_binop_logic_or for vertex programs Continue using ADD in the other case because a fragment shader backend could fuse the ADD with a MUL to generate a MAD for ((x && y) || z). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	f7328f9afd	prog: Delete all remains of OPCODE_SNE, OPCODE_SEQ, OPCODE_SGT, and OPCODE_SLE There is nothing left that can generate them. These used to be generated by ir_to_mesa or by the assembler for various NV extensions that have been removed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	fd63e77998	ir_to_mesa: Do not emit OPCODE_SEQ or OPCODE_SNE Nothing that consumes the output of this backend consumes them natively. This is not the way i915 has implemented these instructions, but, as far as I am able to tell, this is the way both the Cg compiler and the HLSL compiler implement these operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	15e6a1a3be	ir_to_mesa: Do not emit OPCODE_SLE or OPCODE_SGT Nothing that consumes the output of this backend consumes them natively. This is the way i915 has implemented these instructions since it began consuming GLSL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Samuel Pitoiset	e46ac18ebe	nvc0: enable compute support by default on GK110+ Compute support seems to be pretty stable now, and according to piglit it doesn't seem to break 3D state. As a side effect, this will expose ARB_compute_shader on GK110/GK208. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-10 17:47:01 +02:00
Marek Olšák	2b58bc4461	gallium/radeon: don't flush the GFX IB if DMA doesn't depend on it Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	fb89f06698	radeonsi: consolidate radeon_add_to_buffer_list calls for DMA Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	60946c0d60	gallium/radeon: add a heuristic for better (S)DMA performance Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	bb74152597	gallium/radeon: flush if DMA IB memory usage is too high This prevents IB rejections due to insane memory usage from many consecutive texture uploads. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	70934de00e	radeonsi: add new SDMA texture copy code This implements: - Linear-to-linear partial copies. (unaligned) - Tiled-to-linear and linear-to-tiled partial copies. (unaligned except 1-2 Bpp) - Tiled-to-tiled partial copies aligned to 8x8. v2: Extend the SDMA L2T VM fault workaround to T2L. - Same algorithm, just applied to T2L. (and using a 0-based address and surface.bo_size instead of buf->size) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	a512da36ae	gallium/radeon: fix (S)DMA read-after-write hazards Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	f837c37f02	radeonsi: raise the max size for SDMA buffer copies Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	faa4f0191d	radeonsi: remove SDMA texture copy code Most of this has never worked according to the new test. The new code will be radically different. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	498a40cae8	radeonsi: only expose _init_dma_functions from (S)DMA files just normalizing the interfaces Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	3af28e558f	gallium/radeon: implement randomized SDMA texture copy testing (v2) v2: - adjustments for exercising all important SDMA code paths - decrease the probability of getting huge sizes (faster testing) - increase the probability of getting power-of-two dimensions - change the memory cap to 128MB (faster testing) - better detect which engine has been used Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	f475c9fb07	gallium/radeon: discard CMASK or DCC if overwriting a whole texture by DMA v2: simplify the conditionals Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	2f173b8e13	gallium/radeon: use a common function for DMA blit preparation this is more robust and probably fixes some bugs already Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	2af4b637d8	gallium/radeon: split out code for discarding DCC Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	c85d0c17d9	gallium/radeon: rename r600_texture_disable_cmask -> discard_cmask because it doesn't decompress Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	fb9fe352ea	st/mesa: use transfer_inline_write for memcpy TexSubImage path This allows drivers to use their own fast path for texture uploads. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	871d2aff24	gallium/radeon: fix partial layered transfers of cube (array) textures a staging cube texture with array_size % 6 != 0 doesn't work very well just use 2D_ARRAY or 2D for all staging textures Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	c2377b394b	gallium/radeon: align alignments for better buffer reuse It's for the buffer cache. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	544967faf5	gallium/radeon: use gart_page_size instead of hardcoded 4096 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	bfa8a00920	winsys/radeon: use gart_page_size instead of private size_align Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	9d8c283f28	winsys/amdgpu: move gart_page_size to struct radeon_winsys Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Roland Scheidegger	e4cf8717de	gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir Those aren't really interesting, however outputting them is helpful when trying to feed the IR to llvm llc (or opt) for debugging. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	5c200894c8	gallivm: use InternalLinkage instead of PrivateLinkage for texture functions At least with MCJIT the disassembler will crash otherwise when trying to disassemble such functions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	8b66e2647d	gallivm: disable avx512 features We don't target this yet, and some llvm versions incorrectly enable it based on cpu string, causing crashes. (Albeit this is a losing battle, it is pretty much guaranteed when the next new feature comes along llvm will mistakenly enable it on some future cpu, thus we would have to proactively disable all new features as llvm adds them.) This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-05-10 17:08:16 +02:00
Jose Fonseca	94e8653a3b	Revert "nir: Try to warn when C99 extensions are used in nir headers." This reverts commit `99474dc29b`. -Wpedantic is too verbose, even when applied to just a few includes. We'll just have to deal with the issues as they come. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-10 03:29:24 -07:00
Samuel Iglesias Gonsálvez	4c9006f957	i965/fs: fix MOV_INDIRECT exec_size for doubles In that case, the writes need two times the size of a 32-bit value. We need to adjust the exec_size, so it is not breaking any hardware rule. v2: - Add an assert to verify type size is not less than 4 bytes (Jordan). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	75ada43a3a	i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT v2: - Fix assert's line width (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	03687ab77f	i965/fs: demote_pull_constants() did not take into account double types The constants could be double, and it was allocating size for float types for the destination register of varying pull constant loads. Then the fs_visitor::validate() will complain. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	c3fab3d000	i965/fs: push first double-based uniforms in push constant buffer When there is a mix of definitions of uniforms with 32-bit or 64-bit data type sizes, the driver ends up doing misaligned access to double based variables in the push constant buffer. To fix this, this patch pushes first all the 64-bit variables and then the rest. Then, all the variables would be aligned to its data type size. v2: - Fix typo and improve comment (Jordan). - Use ralloc(NULL,...) instead of rzalloc(mem_ctx,...) (Jordan). - Fix typo (Topi). - Use pointers instead of references in set_push_pull_constant_loc() (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	193cb67a84	i965/fs: recognize writes with a subreg_offset > 0 as partial Usually, writes to a subreg_offset > 0 would also have a stride > 1 and we would recognize them as partial, however, there is one case where this does not happen, that is when we generate code for 64-bit immediates in gen7, where we produce something like this: mov(8) vgrf10:UD, <low 32-bit> mov(8) vgrf10+0.4:UD, <high 32-bit> and then we use the result with a stride of 0, as in: mov(8) vgrf13:DF, vgrf10<0>:DF Although we could try to avoid this issue by producing different code for this by using writes with a stride of 2, that runs into other problems affecting gen7 and the fact is that any instruction that writes to a subreg_offset > 0 is a partial write so we should really recognize them as such. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	34ed61b334	i965/fs/lower_simd_width: Fix registers written for split instructions When the original instruction had a stride > 1, the combined registers written by the split instructions won't amount to the same register space written by the original instruction because the split instructions will use a stride of 1. The current code assumed otherwise and computed the number of registers written by split instructions as an equal share based on the relation between the lowered width and the original execution size of the instruction. It is only after the split, when we interleave the components of the result from the lowered instructions back into the original dst register, that the original stride takes effect and we write all the registers specified by the original instruction. Just make the number of register written the same as the vgrf space we allocate for the dst of the split instruction. Fixes crashes in fp64 tests produced as a result of assigning incorrectly the number of registers written by split instructions, which led to incorrect validation of the size of the writes against the allocated vgrf space. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	9741cff1ec	i965/fs: rename our lower_d2f pass to lower_d2x Since it no longer handles conversions from double to float but from double to various other 32-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	efaf62a40a	i965/fs: implement i2d and u2d Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	c63a6f2149	i965/fs: implement d2i and d2u These need the same treatment as d2f, so generalize our d2f lowering to cover these too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	e0c45182e3	i965/fs: implement d2b v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	80f60a4302	i965/fs: implement fsign() for doubles v2 (Sam): - Fix indentation (Kenneth) - Simplify code (Kenneth) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	c9ecd651e6	i965/fs: add null_reg_df Probably not needed since we fix the dst type of comparisons automatically, but for consistency with the rest of null_reg_* functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	e8a8fc9563	i965/fs: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	9e5ce151a4	i965/fs: handle fp64 opcodes in brw_do_channel_expressions In the case of the pack opcode we are already doing the lowering in NIR, so no need to do it here. The unpack opcode operates on scalars, so it should not be lowered. In the case of frexp_sig and frexp_exp, they are lowered in lower_instructions, so we don't have to care about them. All the remaining opcodes involve conversions from and to doubles and are business as usual. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	a644b0939d	i965/fs: add support for f2d and d2f Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	9e1b3ea199	i965/fs: add a pass for legalizing d2f We need to do this late, in order to avoid partial writes during the optimization loop. v2: Use subscript() instead of stride(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	2286a74e3b	i965/fs: fix dst width calculation in CSE v2 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:08 +02:00
Connor Abbott	fccd15524f	i965/fs: fix regs_written in LOAD_PAYLOAD for doubles v2: Account for the stride of the dst (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:07 +02:00
Connor Abbott	6b6d68ae07	i965/fs: fix is_copy_payload() for doubles v2 (Sam): - LOAD_PAYLOAD treats each header source as a 32B block regardless of the datatype. Drop the change (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:07 +02:00
Connor Abbott	e83f51d54e	i965/fs: fix compares for doubles The destination has to have the same type as the source, or else the simulator will complain. As a result, we need to emit a CMP that outputs a 64-bit wide result and then do a strided MOV to pick out the low 32 bits of each channel. v2: Use subscript() instead of stride() (Curro) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	a5d7e144ea	i965/fs: extend exec_size halving in the generator The HW has a restriction that only vertical stride may cross register boundaries. Previously, this only mattered for SIMD16 instructions where we needed to use the same regioning parameters as the equivalent SIMD8 instruction but double the exec size. But we need to do the same splitting for 64-bit instructions as well as instructions with a stride of 2 (which effectively consume 64 bits per element). Fix up the code to do the right thing instead of special-casing SIMD16. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	4f3888c1ca	i965/fs: fix assign_constant_locations() for doubles Uniform doubles will read two registers, in which case we need to mark both as being live. v2 (Sam): - Use a formula to get the number of registers read with proper units (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:07 +02:00
Connor Abbott	cc64c9e441	i965/fs: use byte_offset() in offset() for uniforms This makes things more consistent, and also fixes the offset calculation for double uniforms. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	fe949949a9	i965/fs: handle uniforms in byte_offset() v2: Do it only for uniforms (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	1f51aada3f	i965/fs: fix type_size() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Iago Toral Quiroga	935e0e305d	i965/fs: optimize unpack double When we are actually unpacking from a double that we have previously packed from its 32-bit components we can bypass the pack operation and source from its arguments directly. v2 (Sam): - Fix line overflow (Topi) - Bail if the parent instruction's source is not SSA (Connor) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Iago Toral Quiroga	ba1907f040	i965/fs: optimize pack double When we are actually creating a double using values obtained from a previous unpack operation we can bypass the unpack and source from the original double value directly. v2: - Style changes (Topi) - Bail if parent instruction's src is not SSA (Connor) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	7782f39e75	i965/fs/nir: translate double pack/unpack v2 (Sam): - Fix line overflow (Topi). v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	fd763177c1	i965/fs: add a pass for lowering PACK opcodes v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	ba582e58cd	i965/fs: add PACK opcode Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Francisco Jerez	cc3bae5cd7	i965/fs: Introduce helper to extract a field from each channel of a register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-05-10 11:25:05 +02:00
Connor Abbott	d17cdacba3	i965/fs: always pass the bitsize to brw_type_for_nir_type() v2 (Sam): - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float() v3 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	a308bae58f	i965/fs: add support for printing double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f2e227d5c	i965/fs: don't propagate 64-bit immediates They can only be used with 1-src instructions, which practically (since we should've constant-propagated away all 1-src instructions with 64-bit immediates in NIR) means that they must be kept in separate MOV's and can't be propagated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f1690fd95	i965/fs: use the NIR bit size when creating registers v2 (Iago): - Squashed bits from 'support double precission constant operands for the implementation of 64-bit emit_load_const'. - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks asserts and functionality for some piglit tests. Just keep 32-bit types untouched and add 64-bit support. - Use DF instead of Q for 64-bit registers. Otherwise the code we generate will use Q sometimes and DF others and we hit unwanted DF/Q conversions, so always use DF. v3 (Sam): - Mark 'reg_type' occurrences as const (Topi). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Palli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Connor Abbott	76de7af8e2	i965: fixup uniform setup for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3210870b34	i965: two-argument instructions can only use 32-bit immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3d10adf603	i965: fix brw_abs_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	830d87840c	i965: fix brw_saturate_immediate() for doubles v2 (Sam): - Mark 'size' as const (Topi). - Add comment to explain that we copy 64 bits regardless of the type (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:03 +02:00
Connor Abbott	7bcc4cccad	i965: fix is_zero(), is_one() and is_negative_one() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	2ae409286c	i965: fix brw_negate_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	cbf7c7f099	i965/eu: add support for DF immediates v2 (Sam): - Remove 'however' from the comment (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	c0a1cd24a8	i965: add support for disassembling DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	bb175db16b	i965: add support for getting/setting DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	5310bca024	i965: add brw_imm_df v2 (Iago) - Fixup accessibility in backend_reg Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	9add73f641	i965/eu: Allow 3-src float ops with doubles v2: - set 3src_src_type for BRW_REGISTER_TYPE_DF (Connor) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	367e762a71	i965/disasm: fix disasm of 3-src doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	45066a6a59	i965: Tell backend register about double precision type Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	520b3b2fd1	i965: Determine size of double precision float register This is used to determine how many registers an instruction reads and writes as well as for offsetting a register region into a desired component. v2 (Connor): rebase on master Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	e88cf0f2d2	i965: Lower DFRACEXP/DLDEXP v2 (Connor): rebase on master which moved this to brw_link.cpp v3 (Sam): - Only enable DFREXP_DLDEXP_TO_ARITH in process_glsl_ir(). This is used for doubles. Single floating point op is lowered by NIR. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	30424fd25a	i965: use pack/unpackDouble lowering Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Connor Abbott	bea2f8beb5	i965: use double lowering pass v2: also lower trunc, ceil, floor, fract and roundEven (Iago) v3: also lower mod for doubles (Sam) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	d00a239b28	freedreno/ir3: lower lrp when operating with double operands Lower lrp when operating with double operands because float version of lrp is also lowered. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	93e690830a	i965: enable lrp lowering for doubles Broadwell and previous generations do not support the lrp instruction operating on doubles. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Dave Airlie	008feb3687	st/glsl_to_tgsi: brown paper bag for the input offsets fix. Oops, thanks compiler. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:41:21 +10:00
Dave Airlie	4d8a71f7f1	glsl: check geometry output vertices limits. This fixes: GL45-CTS.geometry_shader.limits.max_output_vertices Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:26:03 +10:00
Dave Airlie	13c68e1447	mesa/vbo: fix check for zero aliases with 2/10/10/10 This fixes: GL33-CTS.gtf33.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_attrib Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:24:49 +10:00
Eduardo Lima Mitev	60a5d02416	nir/print: Print memory qualifiers in a variable declaration Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:22:05 +02:00
Eduardo Lima Mitev	7f7f58f17f	glsl: Apply memory qualifiers to vars inside named block interfaces This is missing and memory qualifiers are currently being ignored for SSBOs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:21:55 +02:00
Dave Airlie	f75a26d1ba	st/glsl_to_tgsi: handle offsets from inputs This fixes: GL45-CTS.gpu_shader5.texture_gather_offset_color_repeat Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 13:14:29 +10:00
Rob Clark	aa730aca20	scripts: bump git_reviewer.pl --git-min-percent default Bump up default percentage of commits required to be auto-picked for CC. Seems from a bit of trial-and-error to come up with a more reasonable list of CC's this way. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 19:30:28 -04:00
Kenneth Graunke	e034d80fe1	Revert "Revert "i965: Switch to scalar TCS by default."" This reverts commit `bd326c229c`. Now that we've fixed the GPU hangs, let's turn it back on. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:27 -07:00
Kenneth Graunke	5ce405ba0f	i965: Actually assign binding table offsets for the TCS. As far as I can tell, this was just entirely missing...honestly, I'm not sure how anything worked at all. Caught by noticing GPU hangs in image load store tests with scalar TCS, but probably has broader implications. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:18 -07:00
Kenneth Graunke	e0e7280db0	i965: Clamp "Maximum VP Index" to 1 when gl_ViewportIndex isn't written. fs_visitor::emit_urb_writes skips writing the VUE header for shaders that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex. This leaves their values uninitialized. Kristian's nearby comment says: "But often none of the special varyings that live there are written and in that case we can skip writing to the vue header, provided the corresponding state properly clamps the values further down the pipeline." However, we were clamping gl_ViewportIndex to [0, 15], so we would end up using a random viewport. To fix this, detect when the shader doesn't write gl_ViewportIndex, and clamp it to [0, 0]. The vec4 backend always writes zeros to the VUE header, so it doesn't suffer from this problem. With vec4-style HWord writes, we can write the header and position together in a single message. In the FS world, we would need 4 extra MOVs of 0 and a longer message, or a separate OWord write. It's likely cheaper to just clamp the value. Fixes DiRT Showdown and Bioshock Infinite, which only rendered half of the screen - the lower left of two triangles. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93054 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-09 15:31:27 -07:00
Jordan Justen	e74812dbfe	i965/hsw: Fix brw_store_data_imm* For Gen6 through Haswell dword 1 is MBZ. In gen 8 it becomes part of the 64-bit address. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-09 15:05:08 -07:00
Kenneth Graunke	96d43f2d08	i965: Reimplement ARB_transform_feedback2 on Haswell and later. My old implementation accumulated <start, end> pairs in a buffer, and eventually processed that data on the CPU. This meant flushing the batchbuffer and waiting for it to completely execute before we could map it, resulting in really long stalls. We could also run out of space in the buffer, and have to do this early. Instead, we can use Haswell's MI_MATH command to do the (end - start) subtraction, as well as the multiplication by 2 or 3 to convert from the number of primitives written to the number of vertices written. We still need to CS stall to read the counters, but otherwise everything is completely pipelined - there's no CPU<->GPU synchronization required. It also uses only 80 bytes in the buffer, no matter what. Improves performance in Manhattan on Skylake GT3e at 800x600 by 6.1086% +/- 0.954166% (n=9). At 1920x1080, improves performance by 2.82103% +/- 0.148596% (n=84). v2: Fix number of primitives -> number of vertices calculation for GL_TRIANGLES (I was multiplying by 4 instead of 3.) Caught by Jordan Justen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	fdb6c1887f	i965: Add a brw_load_register_reg64 helper. It appears that we can't do this in a single command (like we do for MI_LOAD_REGISTER_IMM) - the Skylake simulator gets rather grumpy about the command length if I try to combine them. No matter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	4c71c8a74a	i965: Only enable ARB_query_buffer_object for newer kernels on Haswell. On Haswell, we need version 6 of the kernel command parser in order to write the math registers. Our implementation of ARB_query_buffer_object heavily relies on MI_MATH, so we should only advertise it when MI_MATH is available. We also need MI_LOAD_REGISTER_REG, which requires version 7 of the command parser. To make these checks easier, introduce a screen->has_mi_math_and_lrr flag that will be set when both commands are supported. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 14:59:58 -07:00
Dave Airlie	2d41eb313f	mesa/objectlabel: don't return info on genned but never bound textures. This fixes some cases in the CTS KHR debug tests where it uses glIsTexture to find an invalid ID and then call GetObjectLabel. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Dave Airlie	bbc6a27590	mesa: don't use genned but unnamed xfb objects. If we try to draw or query an XFB object that hasn't been bound, we shouldn't return any information. This fixes a couple of cases in: GL33-CTS.transform_feedback.api_errors_test The ObjectLabel test is inspired by another test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Samuel Pitoiset	eafe3905d9	nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_* We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-09 21:58:56 +02:00
Jordan Justen	2e2aa992ff	mesa/compute: Fix indirect dispatch buffer size check on 32-bit systems `2655265fcb`, but for compute. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-09 11:16:39 -07:00
Rob Clark	57763ee735	freedreno/ir3: fix fallout from new block iterators Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 13:52:29 -04:00
Nicolai Hähnle	fe102f7677	radeonsi: workaround for tessellation on SI We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tessellation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	d8f3e8e626	radeonsi: always allocate export memory for pixel shaders Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	ad1782cfb5	radeonsi: expose performance counters as 64 bit This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Rob Clark	f096096b77	nir/search: fix typo Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 12:46:24 -04:00
Tim Rowley	b65f7ec450	gallium: enable intel jitevents profiling LLVM when configured with "intel jitevents" enabled can inform VTune about dynamic code, so individual shaders are attributed profiling data and the resulting assembly can be examined. Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-05-09 11:25:02 -05:00
Bruce Cherniak	0062c5f09b	swr: Add missing break in query switch statement. Missed a switch break in query stat collection when refactoring queries. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-05-09 11:21:47 -05:00
Rob Clark	f33083a216	freedreno/ir3: allow for additional VS sysval inputs There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components. We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 11:51:59 -04:00
Emil Velikov	a0d9279e3b	docs: add news item and link release notes for 11.1.4/11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:28:20 +01:00
Emil Velikov	0c5752b672	docs: add sha256 checksums for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:08 +01:00
Emil Velikov	f746aa348e	docs: add release notes for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:07 +01:00
Emil Velikov	596c881162	docs: add sha256 checksums for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:04 +01:00
Emil Velikov	f93d8a885c	docs: add release notes for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:02 +01:00
Jose Fonseca	c521f2d737	scons: Improve Python module dependency discovery. Several NIR scripts were using `from ... import ...` syntax, which wasn't supported. Using the Python standard library's modulefinder solves the problem with less effort and fewer hacks. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-09 14:19:24 +01:00
Marek Olšák	172bfdaa9e	r300g: add support for PIPE_FORMAT_x8R8G8B8_* And set endian swap for packed formats the way it should be done in theory. This allows big endian to work again, but it can still be buggy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-09 13:11:40 +02:00
Daniel Stone	e54b2e902a	Revert "i965: Always use Y-tiled buffers on SKL+" This commit broke Weston, Mutter, and xf86-video-modesetting, on KMS. In order to use Y-tiled buffers, the kernel requires the tiling mode to be explicitly named through the I915_FORMAT_MOD_Y_TILED AddFB2 modifier; it disallows any attempt to infer the buffer's tiling mode. As the GBM API does not have a way to extract modifiers for a buffer, this commit broke all users of GBM on SKL+. Revert it for now, until we get a way to extract modifier information from GBM, and also let GBM users inform the implementation that it intends to use the modifiers. This reverts commit `6a0d036483`. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Hans de Goede <hdegoede@redhat.com>	2016-05-09 10:35:55 +01:00
Dave Airlie	920d78a32c	mesa/shader_query: add missing subroutines cases ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types. Fixes: GL43-CTS.program_interface_query.subroutines-vertex GL43-CTS.program_interface_query.subroutines-tess-control GL43-CTS.program_interface_query.subroutines-tess-eval GL43-CTS.program_interface_query.subroutines-geometry GL43-CTS.program_interface_query.subroutines-fragment GL43-CTS.program_interface_query.subroutines-compute Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-09 06:30:52 +10:00
Kenneth Graunke	742bc53d04	spirv: Fix structure splitting with per-vertex interface arrays. We want to use interface_type, not vtn_var->type. They're normally equivalent, but for geometry/tessellation per-vertex interface arrays, we need to unwrap a level. Otherwise, we tried to iterate over a structure's members but used an array length as the bound instead. If the array length was longer than the number of fields in the structure, we'd crash. Fixes the CreatePipelineGeometryInputBlockPositive layer validation test. v2: Just use glsl_without_array() on the vtn_var type (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Kenneth Graunke	1896682d27	compiler: Add a C wrapper for glsl_type::without_array(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Nicolai Hähnle	b9e6e8e7d4	radeonsi: fix undefined behavior (memcpy arguments must be non-NULL) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	146927ce7b	radeonsi: fix some reported undefined left-shifts One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	60d2fc233b	gallium/radeon: clean left-shift undefined behavior Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	62b7958cd0	gallium: fix various undefined left shifts into sign bit Funnily enough, some of these were turned into a compile-time error by gcc with -fsanitize=undefined ("initializer is not a constant"). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	945c6887ab	compiler/glsl: do not downcast list sentinel This crashes gcc's undefined behaviour sanitizer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:58 -05:00
Nicolai Hähnle	bdad1393a0	mesa/main: fix another undefined left shift Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:45:04 -05:00
Nicolai Hähnle	3e1cf8bf3f	mesa/main: define _NEW_xxx flags as unsigned shifts Since 1 << 31 complains about undefined behaviour; the others are changed only for consistency. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:44:33 -05:00
Bas Nieuwenhuizen	6291f19f71	radeonsi: Compute correct LDS size for fragment shaders. Not sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-06 21:40:17 +02:00
Eric Anholt	a1f698881e	vc4: Add support for loading immediate values in QIR. This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).	2016-05-06 10:25:55 -07:00
Eric Anholt	890dc19eeb	vc4: Make vc4_qpu_validate() produce more verbose failures. Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard to find what's going wrong without getting a dump of the instruction that failed.	2016-05-06 10:25:55 -07:00
Eric Anholt	8e2d0843c0	vc4: Add a small QIR validate pass. This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.	2016-05-06 10:25:55 -07:00
Eric Anholt	daaa9d579d	vc4: Fix the src count on exp2/log2. Found by the upcoming QIR validate pass.	2016-05-06 10:25:55 -07:00
Eric Anholt	d36b28402f	vc4: Reuse QPU disasm's cond flags in QIR. In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.	2016-05-06 10:25:55 -07:00
Eric Anholt	419fee92ee	vc4: When emitting an instruction to an existing temp, mark it non-SSA. Prevents a bug in the later control-flow support series.	2016-05-06 10:25:55 -07:00
Eric Anholt	1387e722cd	vc4: Make sure that we don't overwrite the signal for PROG_END. We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.	2016-05-06 10:25:55 -07:00
Samuel Pitoiset	44de03b0f8	nvc0: unreference images when the context is destroyed Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-06 15:15:32 +02:00
Jose Fonseca	8ae78f7d28	nir: Remove spurious return from void function. Left over from `450c061362`. Trivial. Built locally with clang and gcc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95296	2016-05-06 12:03:34 +01:00
Marek Olšák	901f57dff5	radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4 Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-06 12:56:47 +02:00
Marek Olšák	4489d75a58	r600g: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-06 12:56:47 +02:00
Kenneth Graunke	bd326c229c	Revert "i965: Switch to scalar TCS by default." This reverts commit `b593737ed8`. Apparently it causes GPU hangs on some image load store tests. Let's turn it back off until we figure out why.	2016-05-05 18:03:23 -07:00
Leo Liu	fef0e993a1	st/omx/enc: fix incorrect reference picture order for B frames Stacking frames is for driver that's capable to do dual instances encoding. Such feature is not enabled for B frames currently. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-05 19:26:43 -04:00
Jason Ekstrand	7bc987abe0	i965/fs: Move handling of samples_identical into the switch statement This is where we handle texop_texture_samples so it makes things more consistent.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	3ba228f997	i965/fs: Simplify texture destination fixups There are a few different fixups that we have to do for texture destinations that re-arrange channels, fix hardware vs. API mismatches, or just shrink the result to fit in the NIR destination. These were all being done in a somewhat haphazard manner. This commit replaces all of the shuffling with a single LOAD_PAYLOAD operation at the end and makes it much easier to insert fixups between the texture instruction itself and the LOAD_PAYLOAD. Shader-db results on Haswell: total instructions in shared programs: 6227035 -> 6226669 (-0.01%) instructions in affected programs: 19119 -> 18753 (-1.91%) helped: 85 HURT: 0 total cycles in shared programs: 56491626 -> 56476126 (-0.03%) cycles in affected programs: 672420 -> 656920 (-2.31%) helped: 92 HURT: 42	2016-05-05 16:25:21 -07:00
Jason Ekstrand	7de0ae634e	i965/fs: stop including glsl/ir.h in brw_fs.h We are no longer using anything from GLSL IR in the FS backend.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	a815499294	i965/fs: Merge nir_emit_texture and emit_texture The fs_visitor::emit_texture helper originated when we still had both NIR and IR visitors for the FS backend. Since the old visitor was removed, emit_texture serves no real purpose beyond arbitrarily splitting heavily-linked code across two functions.	2016-05-05 16:25:21 -07:00
Connor Abbott	4fab8dd5ea	nir: remove now-unused nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:42 -07:00
Connor Abbott	7c36f9eb52	vc4: fixup for new nir_foreach_block() Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-05 16:19:41 -07:00
Connor Abbott	582815d9ea	ir3: fixup for new nir_foreach_block()	2016-05-05 16:19:41 -07:00
Jason Ekstrand	31fc4a2528	nir/lower_double_ops: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	450c061362	nir/lower_double_pack: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	8c807cc2a6	nir/gather_info: fixup for new foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	331b9f73a2	nir/lower_two_sided_color: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	d40fbbc27e	nir/lower_tex: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	8a7fe634d2	nir/lower_outputs_to_temporaries: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Kenneth Graunke	b593737ed8	i965: Switch to scalar TCS by default. Normally, we expect SIMD8 shaders to be more instructions than SIMD4x2 shaders, as it takes four instructions to operate on a vec4, rather than a single instruction. However, the benefit is that it can process 8 objects per shader thread instead of 2. Surprisingly, the shader-db statistics show an improvement in both instruction and cycle counts: Synmark: -31.25% instructions, -29.27% cycles, 0 hurt. Tessmark: -36.92% instructions, -37.81% cycles, 0 hurt. Unigine Heaven: -3.42% instructions, -17.95% cycles, 0 hurt. Shadow of Mordor: +13.24% instructions (26 with fewer instructions, 45 with more), -5.23% cycles (44 with fewer cycles, 27 with more cycles). Presumably, this is because the SIMD8 URB messages are a much more natural fit than the SIMD4x2 URB messages - there's a ton less header setup. I benchmarked Shadow of Mordor and Unigine Heaven on my Skylake GT3e, and the performance seems to be the same or increase ever so slightly (< 1 FPS difference). So I believe it's strictly superior. There's also a lot more optimization potential we can do in scalar mode. This will also help us finish fp64 support, as scalar support is going to land much sooner than vec4-mode support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	bc0062c54a	nir: Optimize out stores of undefs. There are a couple of cycle count changes in shader-db, but it's basically a wash. However, with the Broadwell scalar TCS backend enabled, many Shadow of Mordor shaders benefit from this patch. Because we don't batch up output writes for TCS, vec4 outputs might not have all components defined. Many output writes have a value of undef, which is useless. With scalar TCS, stats for tessellation shaders on Broadwell: total instructions in shared programs: 1283000 -> 1280444 (-0.20%) instructions in affected programs: 34302 -> 31746 (-7.45%) helped: 71 HURT: 0 total cycles in shared programs: 10798768 -> 10780682 (-0.17%) cycles in affected programs: 158004 -> 139918 (-11.45%) helped: 71 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	c7a8b32700	nir: Replace vecN(undef, undef, ...) with a single undef. shader-db statistics on Broadwell: total instructions in shared programs: 8963409 -> 8962455 (-0.01%) instructions in affected programs: 60858 -> 59904 (-1.57%) helped: 318 HURT: 0 total cycles in shared programs: 71408022 -> 71406276 (-0.00%) cycles in affected programs: 398416 -> 396670 (-0.44%) helped: 199 HURT: 51 GAINED: 1 The only shaders affected were in Dota 2 Reborn. It also sets up for the next optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	49ea7454a1	nir: Rename opt_undef_alu to opt_undef_csel; update comments. This better reflects what it does. I plan to add other ALU optimizations as well, so the old name would be confusing. In preparation for that, also move the file comments about csels above the opt_undef_csel function, and delete the ones about there not being other optimizations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	a808ba5965	i965: Rework passthrough TCS checks. According to Timothy, using program_string_id == 0 to identify the passthrough TCS is going to be problematic for his shader cache work. So, change it to strcmp() the name at visitor creation time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Tim Rowley	ff8c0c9a35	swr: [rasterizer core] Faster modulo operator in ProcessVerts Avoid % operator, since we know that curVertex is always incrementing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:11 -05:00
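The idea in the commit above (replace `%` with a compare-and-reset when the index only ever advances by one) can be sketched as below. This is an illustrative assumption about the general technique, not the actual `ProcessVerts` code; `example_next_wrapped` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Advance an index by one and wrap it to zero at 'count'.
 * Equivalent to (cur + 1) % count as long as cur < count on entry,
 * but avoids the integer division that % usually compiles to. */
static inline uint32_t example_next_wrapped(uint32_t cur, uint32_t count)
{
    assert(count > 0 && cur < count);
    cur++;
    if (cur == count)
        cur = 0;
    return cur;
}
```

The shortcut is only valid because the index moves in steps of one; an arbitrary stride would need a subtraction loop or the real modulo.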
Tim Rowley	2be7c3e780	swr: [rasterizer] Small warning cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:03 -05:00
Tim Rowley	b39c530f88	swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macros Fix static code analysis errors found by coverity on Linux Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:56 -05:00
Tim Rowley	db084f48eb	swr: [rasterizer] Miscellaneous backend changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:48 -05:00
Tim Rowley	3951a2109e	swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT format Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:42 -05:00
Tim Rowley	909aee07f8	swr: [rasterizer jitter] Fix printing bugs for tracing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:29 -05:00
Tim Rowley	bc084e6b3d	swr: [rasterizer memory] Add missing store tiles function Storing color hot tile to 8bit w-major stencil format. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:23 -05:00
Tim Rowley	5332c9d931	swr: [rasterizer jitter] Add asserts for supported formats in fetch shader Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:18 -05:00
Tim Rowley	6e89227054	swr: [rasterizer core] Fix thread allocation Fix windows in 32-bit mode when hyperthreading is disabled on Xeons. Some support for asymmetric processor topologies. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:11 -05:00
Tim Rowley	c2f5d2daa8	swr: [rasterizer core] Fix threadviz support in buckets Need to do lazy eval of the threadviz knob since order of globals is undefined. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:04 -05:00
Tim Rowley	1eb211c4a4	swr: [rasterizer] Whitespace cleanup and misc changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:48:55 -05:00
Nicolai Hähnle	d97e333ea4	radeonsi: mark descriptor loads as using dynamically uniform indices This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-05 12:21:40 -05:00
Matt Turner	f01d92f473	i965/fs: Don't follow pow with an instruction with two dest regs. Beginning with commit `7b208a73`, Unigine Valley began hanging the GPU on Gen >= 8 platforms. Evidently that commit allowed the scheduler to make different choices that somehow finally ran afoul of a hardware bug in which POW and FDIV instructions may not be followed by an instruction with two destination registers (including compressed instructions). I presume the conditions are more complex than that, but the internal hardware bug report (BDWGFX bug_de 1696294) does not contain much more information. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94924 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1] Tested-by: Mark Janes <mark.a.janes@intel.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-05 10:18:28 -07:00
Bruce Cherniak	9d86a5eea7	swr: Remove stall waiting for core query counters. When gathering query results, swr_gather_stats was unnecessarily stalling the entire pipeline. Results are now collected asynchronously, with a fence marking completion. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2016-05-05 10:50:09 -05:00
Dave Airlie	76a36ac3ea	mesa/ubo: add missing compute cases for ubo/atomic buffers This fixes: GL43-CTS.compute_shader.resource-ubo Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-05 20:29:02 +10:00
Dave Airlie	2dd3fc3cac	mesa/compute: drop pointless casts. We already are a GLintptr, casting won't help. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-05 20:28:41 +10:00
Thomas Hindoe Paaboel Andersen	76a423efe0	mesa: remove null check before free Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:50:38 +02:00
Thomas Hindoe Paaboel Andersen	3a6763f0a0	freedreno: remove null check before free Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:34:01 +02:00
Thomas Hindoe Paaboel Andersen	8698194313	nir: fix assert for wildcard pairs The assert was null checking dest_arr_parent twice. The intention seems to be to check both dest_ and src_. Added in `d3636da9` Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:33:02 +02:00
Brian Paul	be5010c4b8	glapi: fix parameter type for GetSamplerParameterIuivEXT() in es_EXT.xml The function returns GLuint, not GLfloat values. v2: also fix the OES function Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-04 14:49:39 -06:00
Brian Paul	54d203a319	mesa: include texture format in glGenerateMipmap error message Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-04 14:49:39 -06:00
Brian Paul	a62f031bc3	main: uses casts to silence some _mesa_debug() format warnings Silences warnings with 32-bit Linux gcc builds and MinGW which doesn't recognize the ‘t’ conversion character. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-05-04 14:49:39 -06:00
Jordan Justen	51300a0387	docs: Mark GL_ARB_query_buffer_object as done for i965/hsw+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 11:23:17 -07:00
Jordan Justen	f00c399bae	i965: Implement ARB_query_buffer_object for HSW+ v2: * Declare loop index variable at loop site (idr) * Make arrays of MI_MATH instructions 'static const' (idr) * Remove commented debug code (idr) * Updated comment in set_query_availability (Ken) * Replace switch with if/else in hsw_result_to_gpr0 (Ken) * Only divide GL_FRAGMENT_SHADER_INVOCATIONS_ARB by 4 on hsw and gen8 (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-04 11:23:17 -07:00
Jordan Justen	357ff91359	i965/gen6+: Add load register immediate helper functions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	959e1e9e66	i965/hsw+: Add support for copying a register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	aad14a22cb	i965/gen6+: Add support for storing immediate data into a buffer Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	ac0bbf9ef3	i965: Add MI_MATH reg defs for HSW+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	9f581f8f24	i965: Add brw_store_register_mem32 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	c54e5c2fb2	i965: Use offset instead of index in brw_store_register_mem64 This matches the byte based offset of brw_load_register_mem. The function is also moved into intel_batchbuffer.c like brw_load_register_mem. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:10 -07:00
Jan Vesely	77959ce07b	r600,compute: create vtx buffer for text + rodata Reserve buffer id 2 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-04 13:09:18 -04:00
Rob Clark	2e117a7649	freedreno: allow ctx->draw_vbo to fail Pretty much only happens if shader variant compile fails. But in this case, if we haven't emitted cmdstream, we don't want to set needs_flush. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	291ac872a4	freedreno: move shader-stage dirty bits to global dirty flag This was always a bit overly complicated, and had some issues (like ctx->prog.dirty not getting reset at the end of the batch). It also required some special hacks to avoid resetting dirty state on binning pass. So just move it all into ctx->dirty (leaving some free bits for future shader stages), and make FD_DIRTY_PROG just be the union of all FD_SHADER_DIRTY_*. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	a48cccacf3	freedreno/a4xx: fix bogus offset for f32x24s8 stencil restore fixes: $piglit/bin/fbo-clear-formats GL_ARB_depth_buffer_float Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	e7c64041e9	freedreno: add some debug_asserts() to catch insane offsets Ofc won't catch all faults, but at least helpful for catching offsets which are completely bogus. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	1f2bc64f31	freedreno/a4xx: deal with VS which do not write position Fixes $piglit/bin/glsl-1.40-tf-no-position a3xx may need similar? Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	a6ad30202c	freedreno/ir3: remove a couple redundant is_flow()s Now that the opc's encode the instruction category (making them unique) we no longer need to check the category in addition to the opc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	f0a1f3de27	freedreno/ir3: cp small negative integers too Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	1f04d4bf59	freedreno/ir3: fix # of registers The instruction encoding allows for more registers, but at least on a3xx/a4xx they don't actually exist. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	173871dfb9	freedreno/ir3: lower immeds to const Helps reduce register pressure and instruction counts for immediates that would otherwise require a mov into gpr. total instructions in shared programs: 4455332 -> 4369297 (-1.93%) total dwords in shared programs: 8807872 -> 8614432 (-2.20%) total full registers used in shared programs: 263062 -> 250846 (-4.64%) total half registers used in shader programs: 9845 -> 9845 (0.00%) total const registers used in shared programs: 1029735 -> 1466993 (42.46%) half full const instr dwords helped 0 10415 0 17861 5912 hurt 0 1157 21458 947 33 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	b15c7fc268	freedreno/ir3: add ir3_cp_ctx Needed in next commit.. just split out to reduce noise. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	b9985e5bde	add REVIEWERS and get_reviewer.pl script Copied from linux kernel (where it is called MAINTAINERS and get_maintainer.pl), with minimal changes to script (to recognize mesa src tree rather than linux kernel src tree, and to avoid accidentally CC'ing Linus Torvalds on mesa patches), and slimmed down MAINTAINER file syntax to recognize that we don't really have subsystem "maintainers" in the same sense as the linux kernel (ie. no different mailing lists and git trees per subsystem). The main point is to automate slapping on the correct CC's for patches via git's --cc-cmd feature, more than anything else. I didn't attempt to fully populate the REVIEWERS file, by a long shot. This is an opt-in system and anyone else can add their own entries. To utilize: git send-email --cc-cmd ./scripts/get_reviewer.pl ... or to configure it to be the default: git config sendemail.cccmd ./scripts/get_reviewer.pl Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:46 -04:00
Ilia Mirkin	38fcf7cbad	nouveau/video: properly detect the decoder class for availability checks The kernel is now more strict with the class ids it exposes, so we need to check the G98 and MCP89 classes as well as the GT215 class. This effectively caused us to decide there were no decoding capabilities on newer kernel for VP3 chips. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-05-04 10:45:07 -04:00
Kenneth Graunke	0332963d19	i965: Delete stale perf_debug(). MOCS for 3DSTATE_SO_BUFFER has existed for ages.	2016-05-04 02:29:03 -07:00
Kenneth Graunke	3a886721ed	i965: Silence unused variable warning I added this when deleting some unnecessary code in a rebase.	2016-05-04 00:46:31 -07:00
Juan A. Suarez Romero	97989059b9	mesa/main: handle double uniform matrices properly When computing the offset in the uniform storage table, take into account the size multiplier so double precision matrices are handled correctly. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 08:08:12 +02:00
Samuel Iglesias Gonsálvez	2ab2d2e588	nir: Separate 32 and 64-bit fmod lowering Split 32-bit and 64-bit fmod lowering as the drivers might need to lower them separately inside NIR depending on the HW support. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 08:07:49 +02:00
Samuel Iglesias Gonsálvez	b902377a56	nir/lower_double_ops: lower mod() There are rounding errors with the division in i965 that affect the mod(x,y) result when x = N * y. Instead of returning '0' it was returning 'y'. This lowering pass fixes those cases. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 08:07:49 +02:00
Matt Turner	9f81434c5f	i965: Define GEN_GE/GEN_LE macros in terms of GEN_LT. GEN_LT has a straightforward implementation on which we can build the GEN_GE and GEN_LE macros. Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:01 -07:00
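One way to build "greater or equal" and "less or equal" masks from a "less than" mask is sketched below, assuming each hardware generation is represented by a single bit. The `EX_GEN*` names, the enum, and `example_supported()` are hypothetical and may not match the driver's real definitions.

```c
#include <assert.h>

/* Hypothetical one-bit-per-generation flags; not the driver's actual enum. */
enum example_gen {
    EX_GEN4 = 1u << 0,
    EX_GEN5 = 1u << 1,
    EX_GEN6 = 1u << 2,
    EX_GEN7 = 1u << 3,
};

/* For a single-bit value, subtracting one sets every lower bit. */
#define EX_GEN_LT(gen) ((unsigned)(gen) - 1u)
/* Everything that is not "less than" is "greater or equal". */
#define EX_GEN_GE(gen) (~EX_GEN_LT(gen))
/* "Less or equal" is "less than" plus the generation itself. */
#define EX_GEN_LE(gen) (EX_GEN_LT(gen) | (unsigned)(gen))

/* Return nonzero if 'gen' is included in the mask 'gens_mask'. */
static inline int example_supported(unsigned gens_mask, enum example_gen gen)
{
    unsigned g = (unsigned)gen;
    assert(g != 0 && (g & (g - 1u)) == 0); /* exactly one bit set */
    return (gens_mask & g) != 0;
}
```

A table entry could then store something like `EX_GEN_GE(EX_GEN6)` to mean "available from generation 6 onward", and a lookup would test the current device's bit against that mask.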
Matt Turner	affaae197f	i965: Add disassembler support for remaining opcodes. For opcodes that changed meaning on different generations, we store a pointer to a secondary table and the table's size in a tagged union in place of the mnemonic and number of sources. Acked-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:00 -07:00
Matt Turner	b89b0a03f2	i965: Make opcode_descs and gen_from_devinfo() static. The previous commit replaced direct uses of opcode_descs with calls to the wrapper function, which should be the only method of accessing opcode_descs's data. As a result gen_from_devinfo() can also be made static. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:00 -07:00
Matt Turner	0ff4912cf4	i965: Actually check whether the opcode is supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:59 -07:00
Matt Turner	667408b889	i965: Merge inst_info and opcode_desc tables. I merged opcode_desc into inst_info (instead of the other way around) because inst_info was sorted by opcode number. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:42 -07:00
Matt Turner	d01596613b	i965: Move inst_info from brw_eu_validate.c to brw_eu.c. Drop the uses of 'enum gen' to a plain int, so that we don't have to expose the bitfield definitions and GEN_GE/GEN_LE macros to other users of brw_eu.h. As a result, s/.gen/.gens/ to avoid confusion with devinfo->gen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:42 -07:00
Francisco Jerez	1530e27534	i965/disasm: Wrap opcode_desc look-up in a function. The function takes a device info struct as argument in addition to the opcode number in order to disambiguate between multiple opcode_desc entries for different instructions with the same opcode number. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] [v2] mattst88: Put brw_opcode_desc() in brw_eu.c instead of moving it there in a later patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2] [v3] mattst88: Return NULL if opcode >= ARRAY_SIZE(opcode_descs) Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-03 22:32:40 -07:00
Francisco Jerez	1cc7573162	i965: Pass devinfo pointer to is_3src() helpers. This is not strictly required for the following changes because none of the three-source opcodes we support at the moment in the compiler back-end has been removed or redefined, but that's likely to change in the future. In any case having hardware instructions specified as a pair of hardware device and opcode number explicitly in all cases will simplify the opcode look-up interface introduced in a subsequent commit, since the opcode number alone is in general ambiguous. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 18:06:21 -07:00
Francisco Jerez	c55dc77ab1	i965: Pass devinfo pointer to brw_instruction_name(). A future series will implement support for an instruction that happens to have the same opcode number as another instruction we support already on a disjoint set of hardware generations. In order to disambiguate which instruction it is brw_instruction_name() will need some way to find out which device we are generating code for. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 18:06:21 -07:00
Kenneth Graunke	7d9143ad88	i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode. Unlike most shader stages, the Hull Shader hardware makes us explicitly tell it how many threads to dispatch and manually configure the channel mask. One perk of this is that we have a lot of flexibility - we can run it in either SIMD4x2 or SIMD8 mode. Treating it as SIMD8 means that shaders with 8 or fewer output vertices (which is overwhelmingly the common case) can be handled by a single thread. This has several intriguing properties: - Accessing input arrays with gl_InvocationID as the index is a simple SIMD8 URB read with g1 as the header. No indirect addressing required. - Barriers are no-ops. - We could potentially do output shadowing to combine writes, as the concurrency concerns are gone. (We don't do this yet, though.) v2: Drop first_non_payload_grf change, as it was always adding 0 (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-03 16:28:00 -07:00
Kenneth Graunke	75881bed9e	i965: Rework the TCS passthrough shader to use NIR. I'm about to implement a scalar TCS backend, and I'd rather not duplicate all of this code there. One change is that we now write the tessellation levels from all TCS threads, rather than just the first. This is pretty harmless, and was easier. The IF/ENDIF needed for that are gone; otherwise the generated code is basically identical. I chose to emit load/store intrinsics directly because it was easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-03 16:27:52 -07:00
Brian Paul	ef5a31fc06	gallium/util: change assertion to conditional in util_bitmask_destroy() If we fail to create a context in the VMware driver we call this function unconditionally to free a bunch of bit vectors. Instead of asserting on a null pointer, just no-op. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-03 15:40:49 -06:00
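The pattern described in the commit above, a destroy function that silently accepts a null pointer so error-path cleanup can call it unconditionally, might look like the sketch below. The `example_bitmask` type and both functions are hypothetical stand-ins, not the real `util_bitmask` implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bit-vector type for illustration only. */
struct example_bitmask {
    unsigned *words;
    unsigned num_words;
};

/* Allocate a zeroed bitmask with room for 'num_words' words.
 * Returns NULL on allocation failure. */
static inline struct example_bitmask *example_bitmask_create(unsigned num_words)
{
    struct example_bitmask *bm = calloc(1, sizeof(*bm));
    if (!bm)
        return NULL;
    bm->words = calloc(num_words ? num_words : 1, sizeof(*bm->words));
    if (!bm->words) {
        free(bm);
        return NULL;
    }
    bm->num_words = num_words;
    return bm;
}

/* Free a bitmask. A NULL argument is a no-op rather than an assertion
 * failure, so a partially constructed owner can be torn down safely. */
static inline void example_bitmask_destroy(struct example_bitmask *bm)
{
    if (!bm)
        return;
    free(bm->words);
    free(bm);
}
```

This mirrors the behaviour of `free(NULL)` in standard C, which is also defined to do nothing.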
Brian Paul	68116dcd5a	cso: null-out previously bound sampler states If, for example, we previously had 2 sampler states bound and now we are binding one, we'd leave the second sampler state unchanged. This change nulls-out the second sampler state in this situation. We're already doing the same thing for sampler views. This silences an occasional warning issued by the VMware driver when the number of sampler views and sampler states disagreed. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-03 15:40:49 -06:00
Brian Paul	05abaa65c7	svga: try to flag surfaces for sampling, in addition to rendering This silences some warnings when we try to sample from surfaces that were created for drawing, such as when blitting from one of the framebuffer surfaces. We were already doing the opposite situation (adding a bind flag for rendering to surfaces declared as texture sources). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	abc6432d54	svga: fix copying non-zero layers of 1D array textures Like cube maps, we need to convert the z information to a layer index. Also rename the _face vars to _face_layer to make things a little more understandable. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	b94f73c150	svga: clean up svga_pipe_blit.c Remove dead code. Fix formatting. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	8842be1132	rbug: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	7f641916bf	freedreno: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-03 15:40:48 -06:00
Brian Paul	b91975714d	trace: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	e193c5dd59	ilo: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	951bf8b4a6	i915g: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Samuel Pitoiset	5658ddc7fe	nvc0: compute a percentage for metric-achieved_occupancy metric-issue_slot_utilization and metric-branch_efficiency are already computed as percentages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	10ec27760a	nvc0: display some performance metrics with a percentage This makes more sense for them. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	64937615a0	nvc0: store the driver query type for performance metrics This will allow using percentages for some metrics, because the Gallium HUD cannot display floating point numbers and prints 0 instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	a9bc3211f5	nvc0: fix exposing of metric-issue_slots for SM21/SM30 This is most likely a copy-paste error from when I reworked this area a few weeks ago. For SM20, metric-issue_slots is equal to inst_issued because there is only one pipeline, so the metric is not exposed there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Karol Herbst <nouveau@karolherbst.de>	2016-05-03 23:18:50 +02:00
Mark Janes	0af8a7d50c	mesa/objectlabel: handle NULL src string This prevents a crash when a NULL src is passed with a non-NULL length. fixes: dEQP-GLES31.functional.debug.object_labels.query_length_only Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95252 Signed-off-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-03 14:07:31 -07:00
Dave Airlie	265fe9dce8	glsl: subroutine types cannot be used in constructors. This fixes two of the cases in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-04 06:44:45 +10:00
Dave Airlie	3110a0aa23	glsl: resource is a reserved keyword in GLSL 4.20 as well resource just appears in GLSL 4.20 without any fanfare. Fixes GL43-CTS.CommonBugs.CommonBug_ReservedNames Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-04 06:44:45 +10:00
Jan Vesely	ebbe31d57c	gallium,utils: Fix trivial sign compare warnings Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-03 12:00:09 -04:00
Knut Andre Tidemann	c68a9cdaac	anv: fix hang during generation of dev_icd.json. Fixes: `b370ec7c76` ("anv: tweak the %.json rule") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-03 11:42:47 +01:00
Anuj Phogat	883f3662db	swrast: Add texfetch_funcs entries for astc 3d formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	63432eb370	mesa: Enable translation between astc 3d gl formats and mesa formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	54cac7ad96	mesa: Handle astc 3d formats in _mesa_get_compressed_formats() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	dcfea1d7eb	mesa: Handle astc 3d formats in _mesa_base_tex_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	cf85ef1618	mesa: Account for astc 3d formats in _mesa_is_astc_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	38cd8145a8	mesa: Add a helper function is_astc_3d_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	72dfe0242d	mesa: Add the missing defines for GL_OES_texture_compression_astc Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	57451e0fc1	mesa: Align the values of #define's in glheader.h Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	0306110fa9	mesa: Add OES_texture_compression_astc to extension table and gl_extensions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	059f36c671	mesa: Add entries for astc 3d formats initializing struct gl_format_info Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	705216dbed	mesa: Add mesa formats for astc 3d formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	24bb6ee8b6	glapi: Update dispatch XML files for OES_texture_compression_astc.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	63a7a9d115	mesa: Account for block depth in _mesa_format_image_size() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	87bf66daa9	mesa: Handle 3d block sizes in _mesa_compute_compressed_pixelstore Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	84a44844f2	mesa: Handle 3d block sizes in teximage error checks Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	ec60b3da69	mesa: Handle 3d block sizes in getteximage error checks Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	5713461ae7	mesa: Add an assert for BlockDepth in _mesa_get_format_block_size() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Anuj Phogat	9163c37349	mesa: Add a helper function to query 3D block sizes Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Anuj Phogat	6abb1b4984	mesa: Add block depth field in struct gl_format_info This will be later required for 3D ASTC formats. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Dave Airlie	c4a0cd4662	mesa/copyimage: make sure number of samples match. This fixes GL43-CTS.copy_image.samples_missmatch which otherwise asserts in the radeonsi driver. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:13:29 +10:00
Dave Airlie	5989a2937f	mesa/objectlabel: don't do memcpy if bufSize is 0 (v2) This prevents GL43-CTS.khr_debug.labels_non_debug from memcpying all over the stack and crashing. v2: actually fix the test. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:12:59 +10:00
Dave Airlie	30823f997b	mesa/textureview: move error checks up higher GL43-CTS.texture_view.errors checks for GL_INVALID_VALUE here but we catch these problems in the dimensionsOK check and return the wrong error value. This fixes: GL43-CTS.texture_view.errors. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:12:52 +10:00
Marek Olšák	5541e11b9a	gallium/radeon: remove stencil_tile_split from metadata this is a leftover from the days when depth-stencil buffers were allocated by the DDX Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	20a77397fa	gallium/radeon: remove tile_mode_array_valid flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	c8aac4fc0d	winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture import This hasn't been needed, but I think we should set it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	dc970c4f4e	winsys/amdgpu: read NUM_BANKS from buffer metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	02f90cef7d	radeonsi: remove unused tile mode getters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	b9e3e87069	radeonsi: just read tile mode arrays in SDMA setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	0c2cba1ec6	radeonsi: just read tile mode arrays in SI DMA setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	c3ca54aee9	radeonsi: just read tile mode arrays in DB setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	ef45825708	gallium/radeon: add radeon_surf::macro_tile_index for indexing cik_macrotile_mode_array Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	ed4fd542de	winsys/radeon: drop support for kernels lacking tile mode array queries This will allow us to simplify a lot of code around tiling. Kernel 3.10 is required for SI support. Kernel 3.13 is required for CIK support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	3d956b4bc0	st/mesa: fix blit-based GetTexImage for non-finalized textures This fixes getteximage-depth piglit failures on radeonsi. Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	77af6bcc26	winsys/radeon: count buffer size only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	3e3c43418e	winsys/amdgpu: count buffer size only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	f98ba4123c	winsys/amdgpu: loosen up requirements for how much memory IBs can use ported from winsys/radeon. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	9ec00c23c2	radeonsi: when parsing dmesg, skip empty lines Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	9983efca76	radeonsi: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Samuel Pitoiset	819836d240	nv50,nvc0: re-bind old compute state after reading MP perf counters This might be useful to avoid breaking the current compute state when monitoring MP perf counters because we use a compute kernel to read out those counters. This has been initially suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-02 22:30:48 +02:00
Rob Clark	dcf8c4425a	nir: make lower_clamp_color pass work after lower i/o Kinda important to work with tgsi_to_nir, which generates nir which already has i/o lowered. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-02 14:25:38 -04:00
Eric Anholt	226bd92945	vc4: Use NIR lowering for sRGB decode. This should get us the same decode code generated, but with a lot less custom code in the driver.	2016-05-02 11:06:29 -07:00
Eric Anholt	4b326341f3	vc4: Just use NIR lowering for texture projection. This means doing Newton-Raphson on the RCP, but it's probably actually a good thing to be accurate on.	2016-05-02 11:06:29 -07:00
Eric Anholt	2f98bc100d	vc4: Scalarize phi nodes as well. This makes fewer programs with loops fail assertions, replacing those failures with the rendering failure warning.	2016-05-02 11:06:29 -07:00
Eric Anholt	4a2ad8500d	vc4: Add whitespace after each program stage dump. In particular it's been hard to find the point where we switch from dumping pre-optimization QIR and post-optimization QIR.	2016-05-02 11:06:29 -07:00
Eric Anholt	84322b2f31	vc4: Remove the CSE pass. It's not doing anything according to shader-db now that we're using NIR. It would have had to be reworked significantly anyway, to handle control flow.	2016-05-02 11:06:29 -07:00
Eric Anholt	b145b731ab	vc4: Emit only one FRAG_Z or FRAG_W QIR opcode. We were generating piles of FRAG_W for interpolation, only to CSE them away immediately. Since this is the only thing that CSE is doing for us any more, just avoid making the CSE work necessary.	2016-05-02 11:06:29 -07:00
Eric Anholt	e138716d8d	vc4: Use the NIR cubemap normalization instead of our own. This is one of two uses of the current QIR CSE pass according to shader-db. The NIR pass means that we'll end up doing Newton-Raphson on our RCP, which we weren't doing before, but that's probably actually a good thing.	2016-05-02 11:06:29 -07:00
Eric Anholt	3bee7581e6	vc4: Drop the support for DCE of texture instructions. Now that we're using NIR for our optimization, there's no need for this tricky code.	2016-05-02 11:06:29 -07:00
Nicolai Hähnle	155ce49603	radeonsi: fix PIPE_FORMAT_R11G11B10_FLOAT handling That format has first_non_void < 0. This fixes a regression in piglit arb_shader_image_load_store-semantics that was introduced by commit `76b8c5cc60`, while hopefully still shutting Coverity up (and failing in a more obvious way if a similar error should re-appear). Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-02 11:38:23 -05:00
Nicolai Hähnle	169ace5636	radeonsi: correct NULL-pointer check in si_upload_const_buffer Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-02 11:37:55 -05:00
Dave Airlie	cf6dadb00b	softpipe: bump 3D texture limit to 2048 The GL4.1 spec bumps this to 2048, so we should do so. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-02 07:29:02 +10:00
Dave Airlie	277170eeea	softpipe: allow r32 xchg on shader images. This is part of OES_shader_image_atomic.txt. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-02 07:28:58 +10:00
Ilia Mirkin	3950aa47df	softpipe: avoid leaking local_mem on machines alloc failure Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-05-01 11:19:08 -04:00
Ilia Mirkin	ad545d179b	vbo: avoid leaking prim on vbo bind failure Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-05-01 11:19:08 -04:00
Edward O'Callaghan	23cf24e227	mapi/glapi: Fix dup word typo in glapi_getproc.c Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-01 16:07:29 +02:00
Emil Velikov	44f921091a	isl: automake: don't explicitly EXTRA_DIST the tests folder The file(s) within are already picked thanks to the build rule of the respective test. No need to have the folder in EXTRA_DIST. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 14:17:30 +01:00
Timothy Arceri	f982e2434b	mesa: add LOCATION_COMPONENT support to GetProgramResourceiv From Section 7.3.1.1 (Naming Active Resources) of the OpenGL 4.5 spec: "For the property LOCATION_COMPONENT, a single integer indicating the first component of the location assigned to an active input or output variable is written to params. For input and output variables with a component specified by a layout qualifier, the specified component is written. For all other input and output variables, the value zero is written." Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-01 23:13:36 +10:00
Timothy Arceri	b1c872a81e	glsl: add component to has_layout() helper I don't think this will do much, as it's a compiler error to use component without location, which is already in the table, but it's good to be consistent. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-01 23:13:28 +10:00
Timothy Arceri	589053dac7	glsl: validate linking of intrastage component qualifiers Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-05-01 23:13:22 +10:00
Timothy Arceri	0317dfcd9b	glsl: update explicit location matching to support component qualifier This is needed so we don't optimise away the varying when more than one shares the same location. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:15 +10:00
Timothy Arceri	0d88b15f07	glsl: cross validate varyings with a component qualifier This change checks for component overlap, including handling overlap of locations and components by doubles. Previously there was no validation for assigning explicit locations to a location used by the second half of a double. V3: simplify handling of doubles and fix double component aliasing detection V2: fix component matching for matrices Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:10 +10:00
Timothy Arceri	94438578d2	glsl: validate and store component layout qualifier in GLSL IR We make use of the existing IR field location_frac used for tracking component locations. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:05 +10:00
Timothy Arceri	2d9936a686	glsl: allow component qualifier on varying inputs Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-05-01 23:13:00 +10:00
Timothy Arceri	daa8df590b	glsl: parse component layout qualifier Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:12:52 +10:00
WuZhen	ea4c1afd05	android: enable dlopen() on all architectures Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Jose Fonseca	5649d6ab06	winsys/sw/xlib: use correct free function for xlib_dt->data Analogous to previous commit. Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
WuZhen	4f21f3f2e8	winsys/sw/dri: use correct free function for dri_sw_dt->data align_malloc() is used to allocate dri_sw_dt->data, thus we should not be using FREE() but align_free(). Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> [Emil Velikov: tweak commit summary/shortlog] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-01 12:31:29 +01:00
WuZhen	798f7a8596	tgsi: initialize stack allocated struct Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Emil Velikov	fb653641ea	egl: android: do not feed invalid fourcc/pitch into the dri module Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Rob Herring	34ddef39ce	egl: android: add dma-buf fd support Add support for creating images from Android native buffers with dma-buf fd. As dma-buf support also requires DRI image loader extension, add that as well. This is based on several patches originally written by Varad Gautam. I've collapsed them into logical changes and done a bit of reformatting. Using dma-bufs vs. GEM handles is now a runtime decision, similar to the wayland EGL, instead of being a compile-time selection. The dma-buf support is also re-written to use the common dri2_create_image_dma_buf function in egl_dri2.c. Cc: Varad Gautam <varadgautam@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Rob Herring	81a6fff4c5	egl: android: factor out back buffer handling code In preparation to use the same code for dma-bufs, factor out the code to a separate function. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	dfaccf25f5	egl: android: factor out format conversion code to a function Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	d45884ef05	egl: android: disable __DRI_DRI2_LOADER support on render nodes Use of __DRI_DRI2_LOADER extension is only supported for card nodes. In order to support dmabufs, Android will be moving to using render nodes and we need to disable the DRI2 loader extension. This is based on the Wayland EGL code. Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	dbbf7a8e61	Android: fix build ordering of subdirectories Different versions of make behave differently in whether $(wildcard) sorts the results or not. The Android build now explicitly sorts all-named-subdir-makefiles which breaks the build because src/gallium must be included after src/mesa/drivers/dri. The Android build system doesn't support doing "include $(call all-named-subdir-makefiles,...)" twice, so rework things by generating the included makefile list and including them in 2 steps. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Jamey Sharp	595d56cc86	glShaderSource must not change compile status. OpenGL 4.5 Core Profile section 7.1, in the documentation for CompileShader, says: "Changing the source code of a shader object with ShaderSource does not change its compile status or the compiled shader code." According to Karol Herbst, the game "Divinity: Original Sin - Enhanced Edition" depends on this odd quirk of the spec. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551 Signed-off-by: Jamey Sharp <jamey@minilop.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-01 18:46:24 +10:00
Emil Velikov	9fa2e57a73	gallium/radeon: nuke the final pre LLVM 3.6 codepath Missed with commit `100796c15c` "gallium/radeon: drop support for LLVM 3.5" v2: s/LLVN/LLVM/ in shortlog (Nicolai) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-01 08:57:32 +01:00
Emil Velikov	7336df06ed	anv: include the files in the tarball Namely the python script, the ICD header and private headers. We could get the system version of the ICD ones, although there is no .pc file to easily locate and/or manage them. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:46 +01:00
Emil Velikov	9e09507516	i965: don't forget to ship brw_nir_trig_workarounds.py Otherwise we won't be able to regenerate the source file(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:46 +01:00
Emil Velikov	1f04caa09c	isl: include all the files in the tarball Add the missing header(s), generation scripts, README ... Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:34 +01:00
Emil Velikov	cee69ccb92	spirv: automake: add missing headers to the tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:06 +01:00
Emil Velikov	dc38e6b169	automake: wire up the intel vulkan driver to make distcheck Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:06 +01:00
Emil Velikov	dfbf1289a4	anv: update .gitignore Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	fcdcb829d8	anv: automake: remove no longer needed include Thanks to last commit we can nuke it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	3285461ceb	anv: automake: tweak anv_entrypoint.[ch] rule Rather than using cat + cpp feed the file(s) directly into the latter. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	bc7802098e	anv: tweak libvulkan_intel.so link libraries i.e. do not use -lfoo directly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	9f235adf99	anv: cosmetic makefile changes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	446234033d	anv: place the builddir includes before the srcdir ones Otherwise we risk picking the possibly outdated file in the source dir over the fresh one in the builddir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	6cb814727d	automake: tweak SUBDIR reorder and comment it It should help people understand the platforms and how they interact with each other (at least from a build POV). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	4fcf0ba113	configure.ac: remove unused HAVE_EGL_PLATFORM_NULL conditional Afaict the last user was based on st/egl. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	9f3588eb37	automake: drop "EGL_" from HAVE_EGL_PLATFORM_WAYLAND Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	5459db91e3	automake: drop "EGL_" from HAVE_EGL_PLATFORM_X11 The variable covers more than just EGL, let's try to untangle the confusion it brings. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	a56009d089	anv: get rid of VULKAN_ENTRYPOINT_CPPFLAGS variable Add the missing include to AM_CPPFLAGS and use it throughout the makefile. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	6dc169e18f	anv: factor out the X11/XCB build Similar to earlier commit - move all the common bits into a single place, thus improving readability and allowing us to see what's missing. Also don't forget to add the missing bits. This commit should allow us to build a wayland-only vulkan ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	cbc4837b83	anv: kill off custom define HAVE_WAYLAND_PLATFORM The Vulkan API already has an equivalent, so simplify things and just use it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	9bc99f5668	anv: refactor wayland build handling Rather than having things split out in multiple places, consolidate it and add all the missing bits. Also ensure that we use the already built static library libwayland-drm.la. v2 Add missing '\' in the CFLAGS. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-05-01 08:38:04 +01:00
Emil Velikov	3a2d09dd65	automake: include vulkan subdir after wayland-drm We'll reuse the existing wayland-drm static library with next commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:04 +01:00
Emil Velikov	fe918556a2	anv: use a common variable to manage the library dependencies Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	82d0b59f02	anv: use the GENERATED_FILES variable ... rather than having duplicate files throughout the sources lists. Splitting things as is has the side effect of making things clearer and easing a potential android build, which automatically adds BUILT_SOURCES to the binary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	3ee7d8b0eb	anv: fold the tests' makefile Recent commit removed the winsys defines from anv_private.h thus breaking the tests. To fix that and avoid it in the future, merge the tests makefile in the libvulkan one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	f3cb0dcae1	anv: build the core vulkan only once Introduce a static library libvulkan_common.la that is used by libvulkan_intel.la and libvulkan_test.la. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	21800d77ff	anv: kill off custom CFLAGS AM_CFLAGS already does all that we need. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	623cb3a598	anv: add missing link against the math library Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	e98cf60446	anv: split sources lists to Makefile.sources Will allow others to reuse the lists (scons/android anyone ?) and makes the file a lot shorter and easier to read. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	0d3e7b17c9	anv: remove custom rule to install the intel_icd.json Autoconf already does the exact same thing as the manually written rule. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94969 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	30e6f68b3b	anv: tweak the LDFLAGS Copy/paste from the rest of mesa, but namely. - The module should be shared only. - We don't need the explicit ".so", as the vulkan loader will retrieve the full filename from the json - No unresolved symbols in the final binary - Use the linker garbage collector to slim down the final binary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:03 +01:00
Emil Velikov	b370ec7c76	anv: tweak the %.json rule It's used only by dev_icd.json so just call it that way. While we're here, manually expand $< (as it might cause issue on some systems) and drop the unneeded install_libdir substitution. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:03 +01:00
Emil Velikov	abd360ab75	anv: add a comment about dev_icd.json Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:03 +01:00
Emil Velikov	44978a91ff	genxml: ship all the files needed in the tarball v2: The xml files are not called "gen*_pack.xml" (Jason) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:03 +01:00
Emil Velikov	3f23a0f8c1	anv: remove description about GENX_FUNC macro The macro has been gone since commit `1f1cf6fcb0` "anv: Get rid of GENX_FUNC" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:37:25 +01:00
Emil Velikov	0700cdd5aa	gallium/target-helpers: remove inline_wrapper_sw_helper.h Unused as of commit `dddedbec0e` "{st,targets}/nine: use static/dynamic pipe-loader" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Mark Kettenis	b8e59292e6	egl/x11: resolve "initialization from incompatible pointer type" warning With earlier commit we've moved a few functions and changing the argument type from _EGLDisplay * to struct dri2_egl_display *. The latter is effectively a wrapper around the former, thus functionality was preserved, although GCC rightfully warned us about the misuse. Add a simple wrapper that casts and propagates the correct type. Fixes: `9bbf3737f9` ("egl/x11: authenticate before doing chipset id ioctls") Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Reported-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Chuck Atkins	a92910ae37	glx: Refactor the configure options for glx implementation choice (v3) Instead of cascading support for various different implementations of GLX, all three options are now specified through the --enable-glx option: --enable-glx=dri : Enable the DRI-based GLX --enable-glx=xlib : Enable the classic Xlib-based GLX --enable-glx=gallium-xlib : Enable the gallium Xlib-based GLX --enable-glx[=yes] : Defaults to dri if DRI is enabled, else gallium-xlib if gallium is enabled, else xlib This removes the --enable-xlib-glx option and fixes a bug in which both the classic xlib-glx and gallium xlib-glx implementations were getting built causing different versioned and conflicting libGL libraries to be installed. v2: Changes from various review feedback from Emil: a) Fixed typos b) Corrected help docs for new option c) Added appropriate a-b and r-b tags in commit msg d) Fixed various GLX related dependency checks. v3: Rebased to current master and added changelog in commit msg Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94086 Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Thomas Hindoe Paaboel Andersen	cbcd7b60f5	nir/lower_double_ops: fix indentation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:32 -07:00
Thomas Hindoe Paaboel Andersen	21424e019d	nir/opt_dead_cf: fix indentation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:29 -07:00
Thomas Hindoe Paaboel Andersen	6935726197	nir/opt_dead_cf: correction of side effect check Parentheses are needed here as ! takes precedence over the &. The check had the opposite effect to the one intended. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:22 -07:00
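The precedence problem described in this commit can be illustrated with a minimal sketch. The flag names and helper functions below are hypothetical and are not taken from the Mesa source; they only show why `!x & FLAG` differs from `!(x & FLAG)` in C.

```c
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only. */
enum { FLAG_SIDE_EFFECT = 1 << 0, FLAG_OTHER = 1 << 1 };

/* Buggy form: '!' binds tighter than '&', so this evaluates
 * (!flags) & FLAG_SIDE_EFFECT instead of testing the flag bit. */
static bool no_side_effect_buggy(unsigned flags)
{
   return !flags & FLAG_SIDE_EFFECT;
}

/* Fixed form: the parentheses make the bit test happen first. */
static bool no_side_effect_fixed(unsigned flags)
{
   return !(flags & FLAG_SIDE_EFFECT);
}
```

With `flags == FLAG_OTHER` the side-effect bit is clear, yet the buggy form reports a side effect, because `!flags` is already 0 before the mask is applied.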
Rob Clark	663c0e5155	freedreno/ir3: use pipe_debug_callback for shader-db traces For multi-threaded shader-db support. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:20 -04:00
Rob Clark	2578e3edcb	freedreno/a4xx: add debug callback to emit Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	51f20dd279	freedreno/a3xx: add debug callback to emit Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	41d288c306	freedreno: wire up core pipe_debug_callback Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	e04db879f8	freedreno/ir3: handle color clamp variant ourselves Now that there is a pass to do this in NIR, let's just use that and manage the variants ourselves, rather than letting state-tracker do it. This way, mesa/st will precompile shaders without requiring ST_DEBUG=precompile (which requires a debug build). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	64abf6d404	nir: clamp-color-output support Handled by tgsi_emulate for glsl->tgsi case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-30 14:56:19 -04:00
Rob Clark	482cdc4c92	freedreno: fix indentation Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Marek Olšák	53435514c1	radeonsi: fix synchronization of shader images This fixes the winsys->cs_is_buffer_referenced query, which is used for synchronization before buffers are mapped. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-30 19:36:16 +02:00
Samuel Pitoiset	8f2238ccba	st/glsl_to_tgsi: fix potential crash when allocating temporaries When index - t->temps_size is greater than 4096, allocating space for temporaries on demand will miserably crash. This can happen when a game uses a lot of temporaries like the recently released Tomb Raider. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-30 17:41:32 +02:00
Kenneth Graunke	750c38fad1	glsl: Lower vector_extracts to swizzles after lower_vector_derefs. lower_vector_derefs can produce new vector_extract operations. Neither i965 nor st_glsl_to_tgsi can handle them, so we'd best convert them to swizzles. Together with the previous patch, this fixes assertion failures in GLideN64, as well as a new Piglit test which reproduces the issue: spec/glsl-1.10/compiler/vector-dereference-in-dereference.frag Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-29 16:03:36 -07:00
Kenneth Graunke	1cd600dbb9	glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor. The old visitor missed some cases. For example, it wouldn't handle an ir_dereference_array with a vector_extract as the index. Rather than trying to add the missing cases, just rewrite it as an ir_rvalue_visitor. This makes it easy to replace any expression, and is much less code. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-29 16:03:29 -07:00
Thomas Faller	d53cf1ea4c	mesa: simplify _mesa_Lightfv Signed-off-by: Thomas Faller <tfaller1@gmx.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-29 11:08:01 -06:00
Nicolai Hähnle	aa6f88f891	gallium/radeon: fix crash in r600_set_streamout_targets Protect against dereferencing a gap in the targets array. This was triggered by a test in the Khronos CTS. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-29 11:55:06 -05:00
Nicolai Hähnle	98c348d26b	st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor In optimized builds, visit(ir_expression *) experiences inlining with gcc that leads the function to have a roughly 32KB stack frame. This is a problem given that the function is called recursively. In non-optimized builds, the stack frame is much smaller, hence one gets crashes that happen only in optimized builds. Arguably there is a compiler bug or at least severe misfeature here. In any case, the easy thing to do for now seems to be moving the bulk of the non-recursive code into a separate function. This is sufficient to convince my version of gcc not to blow up the stack frame of the recursive part. Just to be sure, add the gcc-specific noinline attribute to prevent this bug from recurring if inliner heuristics change. v2: put ATTRIBUTE_NOINLINE into macros.h Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95133 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95026 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92850 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-29 11:52:59 -05:00
Nicolai Hähnle	59af21c3e9	tgsi/text: fix parsing of memory instructions Properly handle Target and Format parameters when present. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:56 -05:00
Nicolai Hähnle	4055babc75	tgsi/text: add str_match_name_from_array Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:53 -05:00
Nicolai Hähnle	a56edbdd8f	tgsi/text: add str_match_format helper function Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:51 -05:00
Nicolai Hähnle	acb65a23a3	tgsi/build: pass Memory.Texture and .Format through tgsi_build_full_instruction Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:32 -05:00
Nicolai Hähnle	318d305f6d	tgsi/dump: signal nospace when the last print exceeded the size Previously, there was a bug where nospace wasn't signalled if it just so happened that the very last print exceeded the available space. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:28 -05:00
Nicolai Hähnle	e08eaa5b72	tgsi/dump: shared dump_ctx initialization Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:21 -05:00
Emil Velikov	4b1ea6910e	st/omx: don't return early in vid_enc_EncodeFrame() Earlier commit plugged a memory leak, although it missed a pair of brackets. Thus we unconditionally returned even in the case of no error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95203 Fixes: `b87856d25d` ("st/omx: Fix resource leak on OMX_ErrorNone") Tested-by: Andy Furniss <adf.lists@gmail.com> Acked-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> --- What an embarrassing bug - missing brackets. Andy, can you confirm that it resolves the issue?	2016-04-29 15:36:18 +01:00
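The class of bug fixed here, an error path that lost its braces so that the `return` ran unconditionally, can be sketched as follows. All names are hypothetical stand-ins and do not reproduce the actual `vid_enc_EncodeFrame()` code.

```c
#include <stdbool.h>

/* Hypothetical status codes and cleanup helper, for illustration only. */
enum status { STATUS_OK = 0, STATUS_ERROR = 1 };

static int tasks_released;
static void release_task(void) { tasks_released++; }

/* Buggy form: without braces only release_task() is guarded by the if,
 * so the return below runs even when there is no error. */
static bool encode_frame_buggy(enum status err)
{
   if (err != STATUS_OK)
      release_task();
      return false;   /* not part of the if: always executed */

   return true;       /* never reached */
}

/* Fixed form: the braces keep cleanup and early return together.
 * Returns true if the frame would actually be encoded. */
static bool encode_frame_fixed(enum status err)
{
   if (err != STATUS_OK) {
      release_task();
      return false;
   }
   return true;
}
```

Recent GCC versions can flag the buggy layout through `-Wmisleading-indentation`, which is part of `-Wall`.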
Andres Gomez	c750029b37	glsl: Checks for interpolation into its own function. This generalizes the validation also to be done for variables inside interface blocks, which, for some cases, was missing. For a discussion about the additional validation cases included see https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html and Khronos bug #15671. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-04-29 08:03:00 +02:00
Jason Ekstrand	6d4a426745	nir/algebraic: Support lowering for both 64 and 32-bit ldexp Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
Jason Ekstrand	f0af5b87ec	nir/opcodes: Make ldexp take an explicitly 32-bit int There is no sense in having the double version of ldexp take a 64-bit integer. Instead, let's just take a 32-bit int all the time. This also matches what GLSL does where both variants of ldexp take a regular integer for the exponent argument. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
Jason Ekstrand	bee40dd730	nir/opcodes: Simplify the expressions for [un]pack_double The new expressions are more explicit in terms of where the bits go so it's a little easier to tell what's going on. This is the way GLSL specifies things so it's a bit easier to verify too. It also has the benefit that the new expressions easily vectorize so we can constant-fold vector forms of the _split versions correctly. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
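As a rough illustration of being explicit about where the bits go, the sketch below splits a double into two 32-bit words and reassembles it. It follows the GLSL convention that the first component holds the 32 least significant bits. The function names are hypothetical, and the code assumes IEEE 754 binary64 doubles; it is not the NIR constant-folding expression itself.

```c
#include <stdint.h>
#include <string.h>

/* Low 32 bits of the double's bit pattern (the "x" half). */
static uint32_t unpack_double_lo(double d)
{
   uint64_t bits;
   memcpy(&bits, &d, sizeof bits);
   return (uint32_t)bits;
}

/* High 32 bits of the double's bit pattern (the "y" half). */
static uint32_t unpack_double_hi(double d)
{
   uint64_t bits;
   memcpy(&bits, &d, sizeof bits);
   return (uint32_t)(bits >> 32);
}

/* Reassemble a double from its low and high 32-bit halves. */
static double pack_double(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double d;
   memcpy(&d, &bits, sizeof d);
   return d;
}
```

Because each half is computed independently from shifts and masks, the same expressions apply component-wise to vectors.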
Kenneth Graunke	2655265fcb	mesa: Fix indirect draw buffer size check on 32-bit systems. Fixes dEQP-GLES31.functional subtests: draw_indirect.negative.command_offset_not_in_buffer_signed32_wrap draw_indirect.negative.command_offset_not_in_buffer_unsigned32_wrap These tests use really large values that overflow GLsizeiptr, at which point the buffer size isn't less than "end". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95138 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-04-28 16:31:45 -07:00
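The wrap-around described above can be reproduced in isolation. This is a minimal sketch with hypothetical function names that models a 32-bit size type with `uint32_t`; it does not reproduce the actual Mesa validation code, only the idea of widening before adding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrapping check: offset + size is computed in 32 bits, so a huge
 * offset wraps around and the end appears to lie inside the buffer. */
static bool range_fits_wrapping(uint32_t buffer_size, uint32_t offset,
                                uint32_t size)
{
   uint32_t end = offset + size;   /* wraps modulo 2^32 */
   return end <= buffer_size;
}

/* Widened check: doing the addition in 64 bits cannot wrap for
 * 32-bit inputs, so out-of-range offsets are rejected. */
static bool range_fits_widened(uint32_t buffer_size, uint32_t offset,
                               uint32_t size)
{
   uint64_t end = (uint64_t)offset + size;
   return end <= buffer_size;
}
```

For a 1024-byte buffer, an offset of `0xFFFFFFF0` with a 32-byte command passes the wrapping check but fails the widened one.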
Jason Ekstrand	70f89dd75e	nir: Switch the arguments to nir_foreach_def This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/ Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	5015260a05	nir: Switch the arguments to nir_foreach_use and friends This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/ and similar expressions for nir_foreach_use_safe, etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	9464d8c498	nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	e63766fb4b	nir: Switch the arguments to nir_foreach_parallel_copy_entry This matches the "foreach x in container" pattern found in many other programming languages. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	8564916d01	nir: Switch the arguments to nir_foreach_phi_src This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_phi_src(\([^,]*\),\s*\([^,]*\))/nir_foreach_phi_src(\2, \1)/ and a similar expression for nir_foreach_phi_src_safe. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	707e72f13b	nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	261d62de33	anv/lower_push_constants: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Jason Ekstrand	bb65764a4a	anv/apply_pipeline_layout: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Jason Ekstrand	621cbc0c14	anv/apply_dynamic_offsets: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	7efff10585	i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3a8688fb41	nir/algebraic: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1f8c100614	nir/validate: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	a471c161b1	nir/nir_worklist: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	db35177772	nir/remove_dead_variables: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b3aaae398e	nir/split_var_copies: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	9d41a1ffeb	nir/repair_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	480a182ccd	nir/opt_peephole_select: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e5f37701ab	nir/phi_builder: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1ba40d834b	nir/opt_cp: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	8dd7d78925	nir/opt_remove_phis: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1a8c17a59e	nir/opt_undef: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	52affdd2e6	nir/opt_dead_cf: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	ddc6639f85	nir/opt_dce: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3afb3be674	nir/opt_gcm: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	eecf96f530	nir/opt_constant_folding: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	26b4c9ee15	nir/lower_samplers: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	f4ebff89e4	nir/normalize_cubemap_coords: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	492b3554a7	nir/lower_var_copies: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	c1b37c08bf	nir/move_vec_src_uses_to_dest: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	ceed12557d	nir/lower_vars_to_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1557344c81	nir/lower_vec_to_movs: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b1eada04b2	nir/lower_idiv: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	2febb88e6d	nir/lower_to_source_mods: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	c81ca60b41	nir/lower_io: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	7e909972e3	nir/lower_system_values: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	76c74de456	nir/lower_phis_to_scalar: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b89f0bb58c	nir/lower_indirect_derefs: fixup for new foreach_block() v2 (Jason Ekstrand): Use nir_foreach_block_safe Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e3c5bda16a	nir/nir_lower_global_vars: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	480d78f55b	nir/lower_atomics: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	06cf73a7ba	nir/lower_load_const: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	15264133d7	nir/lower_locals_to_regs: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1c6307aab4	nir/lower_gs_intrinsics: fixup for new foreach_block() v2 (Jason Ekstrand): Use nir_foreach_block_safe Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3bf3100794	nir/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	686f247b21	nir/lower_clip: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e36fbcfc3f	nir/lower_alu_to_scalar: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	4179a56f42	nir/liveness: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	34af78edb3	nir/inline_functions: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b23e59e172	nir/from_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	d6a6c729ca	nir/dominance: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Samuel Pitoiset	9f92a8f00a	nvc0: stick compute kernel arguments into uniform_bo Having one buffer object for input kernel arguments coming from clover and another one for OpenGL user uniforms is unnecessary. Using the uniform_bo object for both GL/CL uniforms avoids declaring a new BO. This only affects compute programs but it should not hurt anything because the states are dirtied and data will get reuploaded. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-29 00:44:08 +02:00
Tim Rowley	124a5d4ca0	swr: remove duplicated constant update code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-28 16:16:46 -05:00
Marek Olšák	1a8c2ccb24	gallium/radeon: add the size only once in r600_context_add_resource_size Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 21:06:31 +02:00
Bas Nieuwenhuizen	8e43bc0eb6	winsys/radeon: enlarge buffer_indices_hashlist Enlarge the buffer hashlist to prevent large numbers of misses due to adding more buffers than can be cached in the hashlist. Ported from winsys/amdgpu: `6373845d98` Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 21:06:31 +02:00
Marek Olšák	92f6af2c4a	gallium/radeon: drop support for LINEAR_GENERAL layout Unused. All texture imports use LINEAR_ALIGNED regardless of what the DDX does. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 20:16:56 +02:00
Marek Olšák	f564b61d33	radeonsi: rework clear_buffer flags Changes: - don't flush DB for fast color clears - don't flush any caches for initial clears - remove the flag from si_copy_buffer, always assume shader coherency Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 20:16:56 +02:00
Jason Ekstrand	d273ce5259	anv/dynamic_offsets: Fix the order of arguments to nir_build_imm	2016-04-28 11:05:56 -07:00
Jason Ekstrand	6028a67641	anv: Fix a build error caused by recent fp64 NIR changes	2016-04-28 10:13:42 -07:00
Jose Fonseca	99474dc29b	nir: Try to warn when C99 extensions are used in nir headers. Ideally we'd have nir.h being included with -Wpedantic too, but it fails with: src/compiler/nir/nir.h:754:20: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic] nir_alu_src src[]; ^ In file included from src/compiler/nir/glsl_to_nir.cpp:42:0: src/compiler/nir/nir.h:919:16: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic] nir_src src[]; Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:13 +01:00
Jose Fonseca	e7438009af	nir: Remove spurious ; after nir_builder functions. Makes -pedantic happy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:12 +01:00
Jose Fonseca	caa5937ebb	nir: Remove spurious ; after namespace. Makes -pedantic happy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:12 +01:00
Jose Fonseca	f7854d8227	nir: Avoid C99 field initializers. As they are not standard C++ and are not supported by MSVC C++ compiler. Just have nir_imm_double match nir_imm_float above. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-04-28 16:48:12 +01:00
Brian Paul	a609da60c0	gallium/util: s/Elements/ARRAY_SIZE/ Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 09:04:24 -06:00
Brian Paul	f365488eaa	mesa: improve comment on _mesa_check_disallowed_mapping(), return bool The old comment was a bit terse. Also, change the function return type to bool. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 09:04:17 -06:00
Marek Olšák	7e7710a068	radeonsi: remove needless cache flushes at the end of CP DMA operations not needed AFAIK Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 12:46:47 +02:00
Marek Olšák	7d49b459b6	radeonsi: remove flushes at the beginning and end of IBs done by the kernel Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 12:46:47 +02:00
Samuel Iglesias Gonsálvez	db07b46f2c	nir: Add lrp lowering for doubles in opt_algebraic Some hardware (i965 on Broadwell generation, for example) does not natively support executing the lrp instruction with double arguments. Add 'lower_flrp64' flag to lower this instruction in that case. v2: - Rename lower_flrp_double to lower_flrp64 (Jason) - Fix typo (Jason) - Adapt the code to define bit_size information in the opcodes. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
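Lowering lrp means replacing the single interpolation instruction with basic arithmetic. The sketch below shows one common expansion, `x + a * (y - x)`, on doubles. The function name is hypothetical, and the exact expression used by the NIR pass is an assumption here rather than something stated in the commit message.

```c
/* Linear interpolation between x and y by factor a, written with only
 * a subtract, a multiply and an add (hypothetical helper). */
static double lrp_lowered(double x, double y, double a)
{
   return x + a * (y - x);
}
```

With `a == 0.0` the result is `x`, and with `a == 1.0` it is `y` whenever `y - x` is exactly representable.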
Samuel Iglesias Gonsálvez	443600d51e	nir: rename lower_flrp to lower_flrp32 A later patch will add lower_flrp64 option to NIR. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	072613b3f3	nir/lower_double_ops: lower round_even() At least i965 hardware does not have native support for round_even() on doubles. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	bf91df7f7f	nir/lower_double_ops: lower fract() At least i965 hardware does not have native support for fract() on doubles. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	126a1ac03f	nir/lower_double_ops: lower ceil() At least i965 hardware does not have native support for ceil on doubles. v2 (Sam): - Improve the lowering pass to remove one bcsel (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:36 +02:00
Iago Toral Quiroga	29541ec531	nir/lower_double_ops: lower floor() At least i965 hardware does not have native support for floor on doubles. v2 (Sam): - Improve the lowering pass to remove one bcsel (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:35 +02:00
Iago Toral Quiroga	5fab3d178b	nir/lower_double_ops: lower trunc() At least i965 hardware does not have native support for truncating doubles. v2: - Simplified the implementation significantly. - Fixed the else branch, that was not doing what we wanted. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
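When hardware lacks a native truncate for doubles, the operation can be built from integer bit manipulation. The following is a minimal sketch of that general idea in C, assuming IEEE 754 binary64 doubles. The function name is hypothetical and the code is not a transcription of the NIR lowering pass.

```c
#include <stdint.h>
#include <string.h>

/* Truncate toward zero by clearing the mantissa bits that encode the
 * fractional part (hypothetical helper, assumes IEEE 754 binary64). */
static double trunc_by_masking(double x)
{
   uint64_t bits;
   memcpy(&bits, &x, sizeof bits);

   /* Unbiased exponent: 11 bits starting at bit 52, bias 1023. */
   int exponent = (int)((bits >> 52) & 0x7ff) - 1023;

   if (exponent < 0) {
      /* |x| < 1: the result is zero with the sign of x. */
      bits &= UINT64_C(1) << 63;
   } else if (exponent < 52) {
      /* The low (52 - exponent) mantissa bits are fractional. */
      bits &= ~((UINT64_C(1) << (52 - exponent)) - 1);
   }
   /* exponent >= 52: already integral (this also covers Inf and NaN). */

   memcpy(&x, &bits, sizeof x);
   return x;
}
```

For example, 2.75 has unbiased exponent 1, so the low 51 mantissa bits are cleared and the result is 2.0.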
Connor Abbott	2ea3649c63	nir: add a pass to lower some double operations v2: Move to compiler/nir (Iago) v3: Use nir_imm_int() to load the constants (Sam) v4 (Sam): - Undo line-wrap (Jason). - Fix comment (Jason). - Improve generated code for get_signed_inf() function (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Connor Abbott	2cf3b28884	nir/builder: add nir_imm_double() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Samuel Iglesias Gonsálvez	3a150683ce	nir/builder: Add bit_size info to nir_build_imm() v2: - Group num_components and bit_size together (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Jakob Sinclair	76b8c5cc60	radeonsi: check if value is negative Fixes a Coverity defect by adding checks to see if a value is negative before using it to index an array. By checking the value first it makes the code a bit safer but overall should not have a big impact. CID: 1355598 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-28 11:33:38 +02:00
Michel Dänzer	860210ccfc	clover: Fix build against clang SVN >= r267772 (Re-pushing previous fix for clang SVN r265359, which was reverted in the meantime) Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-28 12:57:03 +09:00
Lars Hamre	32cb7d61a9	glsl: fix lowering outputs for early/nested returns Return statements in conditional blocks were not having their output varyings lowered correctly. This patch fixes the following piglit tests: /spec/glsl-1.10/execution/vs-float-main-return /spec/glsl-1.10/execution/vs-vec2-main-return /spec/glsl-1.10/execution/vs-vec3-main-return Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-28 11:01:51 +10:00
Connor Abbott	122d27e998	nir: rewrite nir_foreach_block and friends Previously, these were functions which took a callback. This meant that the per-block code had to be in a separate function, and all the data that you wanted to pass in had to be a single void *. They walked the control flow tree recursively, doing a depth-first search, and called the callback in a preorder, matching the order of the original source code. But since each node in the control flow tree has a pointer to its parent, we can implement a "get-next" and "get-previous" method that does the same thing that the recursive function did with no state at all. This lets us rewrite nir_foreach_block() as a simple for loop, which lets us greatly simplify its users in some cases. This does require us to rewrite every user, although the transformation from the old nir_foreach_block() to the new nir_foreach_block() is mostly trivial. One subtlety, though, is that the new nir_foreach_block() won't handle the case where the current block is deleted, which the old one could. There's a new nir_foreach_block_safe() which implements the standard trick for solving this. Most users don't modify control flow, though, so they won't need it. Right now, only opt_select_peephole needs it. The old functions are reimplemented in terms of the new macros, although they'll go away after everything is converted. v2: keep an implementation of the old functions around v3 (Jason Ekstrand): A small cosmetic change and a bugfix in the loop handling of nir_cf_node_cf_tree_last(). v4 (Jason Ekstrand): Use the _safe macro in foreach_block_reverse_call Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 15:05:40 -07:00
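The move from callback-based iteration to a for-loop macro, plus a `_safe` variant that tolerates removal of the current element, can be sketched on a plain linked list. The `struct block` type and the macros below are hypothetical stand-ins: the real NIR macros walk the control flow tree through get-next helpers rather than a `next` pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a block; NIR blocks live in a CF tree. */
struct block {
   int index;
   struct block *next;
};

/* Plain iteration: the loop body is inline, so no callback or
 * void * state parameter is needed. */
#define foreach_block(b, first) \
   for (struct block *b = (first); b != NULL; b = b->next)

/* Safe iteration: the successor is fetched before the body runs, so
 * the body may unlink or free the current block. */
#define foreach_block_safe(b, first)                                \
   for (struct block *b = (first), *b##_next = b ? b->next : NULL;  \
        b != NULL;                                                  \
        b = b##_next, b##_next = b ? b->next : NULL)

static int count_blocks(struct block *first)
{
   int n = 0;
   foreach_block(blk, first)
      n++;
   return n;
}

/* Detaches every block from the list while walking it. */
static int detach_all(struct block *first)
{
   int n = 0;
   foreach_block_safe(blk, first) {
      blk->next = NULL;   /* would cut a plain foreach_block walk short */
      n++;
   }
   return n;
}
```

The safe variant costs one extra pointer per loop, which is why a plain macro is enough for passes that never modify control flow.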
Connor Abbott	958300137f	nir/opt_cp: use nir_block_get_following_if() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 15:05:34 -07:00
Jordan Justen	aaaa22c775	vbo: Return INVALID_OPERATION during draw with a mapped buffer Fixes the OpenGLES 3.1 CTS: * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos Because this is triggering the error message after the normal API validation phase, we don't have the API function name available, and therefore we generate an error message without the draw call name: Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 14:30:06 -07:00
Nanley Chery	28d0bc72fb	anv/formats: Return proper error code for unsupported formats Fixes some failures in dEQP-VK.api.info.image_format_properties.* and enables the test group to execute without assert failing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 11:28:30 -07:00
Nanley Chery	5f7e8eac42	anv/device: Set the compressed texture feature flags correctly Sampling from an ETC2 texture is supported on Bay Trail and from Gen8 onwards. While ASTC_LDR is supported on Gen9, the logic to handle such formats has not yet been implemented in the driver. Fixes dEQP-VK.api.info.format_properties.compressed_formats. v2: Enable ETC2 for Bay Trail (Kenneth Graunke) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-27 11:28:30 -07:00
Jason Ekstrand	e0806930ad	nir/algebraic: Add a bit-size validator This commit adds a validator that ensures that all expressions passed through nir_algebraic are 100% non-ambiguous as far as bit-sizes are concerned. This way it's a compile-time error rather than a hard-to-trace C exception some time later. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	8a3e344180	nir/opt_algebraic: Fix some expressions with ambiguous bit sizes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	7e0ee3a38b	nir/search: Respect the bit_size parameter on nir_search_value Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	fcc1c8a437	nir/algebraic: Add a mechanism for specifying the bit size of a value Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	cafb885e45	nir/algebraic: Use "uint" instead of "unsigned" for uint types This is consistent with the rename done for the rest of NIR. Currently, "bool" is the only type specifier used in nir_opt_algebraic.py so this is really a no-op. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	736ee0bef7	nir/algebraic: Do better error reporting of bad expressions Previously, if an exception was encountered anywhere, nir_algebraic would just die in a fire with no indication whatsoever as to where the actual bug is. This commit makes it print out the particular search-and-replace expression that is causing problems along with the exception. Also, it will now report all of the errors it finds and then exit at the end like a standard C compiler would do. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Alejandro Piñeiro	b1dcedf393	isl: move -lm at the end of tests_ldadd The test was failing to build with "undefined reference to `roundf'" errors, so Make check on mesa was failing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 20:14:56 +02:00
Topi Pohjolainen	aef6a6c382	i965/blorp/gen8: Fix blitting of interleaved msaa surfaces Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample. Current logic divides given layer of one by number of samples (four) trashing the layer to zero. Layer adjustment is only to be used with non-interleaved msaa surfaces where samples for particular layer are in multiple slices. I copy-pasted a bit of documentation from brw_blorp.c::brw_blorp_compute_tile_offsets(). Also took the opportunity to fix the comment regarding sampling as 2D, cube textures are the only exception. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-27 19:57:40 +03:00
Brian Paul	1d242b6882	llvmpipe: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	23c55e5c23	tgsi: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	419e386571	os: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	d902504a67	hud: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	e522a76226	gallivm: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	489df4a71a	draw: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	f93802c465	softpipe: s/Elements/ARRAY_SIZE/ Try to standardize on the latter, which is defined in the common util/ directory. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Nicolai Hähnle	562c4a17b7	winsys/radeon: remove use_reusable_pool parameter from buffer_create All callers set this parameter to true. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	13acf2b243	gallium/radeon: remove use_reusable_pool parameter from r600_init_resource All callers set it to true. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	c868974396	radeon/video: always use the reusable buffer pool A semantic error was introduced in a past refactoring that caused the bind parameter to be passed into the use_reusable_pool parameter of buffer_create. Since this clearly makes no sense, and there is no clear reason why the cache _shouldn't_ be used, just use the cache always. Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	8c43c06e04	radeonsi: work around an MSAA fast stencil clear problem A piglit test (arb_texture_multisample-stencil-clear) has been sent. This problem was discovered analyzing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	7a215a3e27	radeonsi: expclear must be disabled on first Z/S clear The documentation and the HW team say so. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	01a3bb5d8b	radeonsi: move blend choice out of loop in si_blit_decompress_color It does not depend on the level or layer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	450ff0f0d5	radeonsi: use level mask for early out in si_blit_decompress_color Mostly for consistency with the other decompress functions, but note that in the non-DCC decompress case, the function can now early-out in slightly more (albeit probably rare) cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	0ff05b55c6	radeonsi: si_blit_decompress_depth is only used for staging Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	0b70fc2db4	radeonsi: only decompress the required ZS planes from si_blit This happens to "fix" a rendering bug in KotOR2, because it avoids a still not quite understood bug with MSAA fast stencil clear decompress. For the stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	def53a0b3d	radeonsi: decompress Z & S planes in one pass Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	dc6fc2f390	radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask Avoid dirtying the db_render_state atom when possible. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	d14d6c3f58	radeonsi: use MIN2 instead of expanded ?: operator Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	159f182a57	radeonsi: fix brace style Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	91fb4bb2e9	gallium/util: add u_bit_consecutive for generating a consecutive range of bits There are some undefined behavior subtleties, so having a function to match the u_bit_scan_consecutive_range makes sense. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Tim Rowley	504df3a1d7	swr: s/Elements/ARRAY_SIZE/ Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 11:07:34 -05:00
Nicolai Hähnle	836cab51c8	radeonsi: emit s_waitcnt for shader memory barriers and volatile Turns out that this is needed after all to satisfy some strengthened coherency tests. Depends on support in LLVM, added in r267729. v2: updated to reflect changes to the LLVM intrinsic Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-27 10:54:05 -05:00
Tim Rowley	e7201bd31b	swr: [rasterizer] warning cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:54 -05:00
Tim Rowley	24f23817d2	swr: [rasterizer core] implement legacy depth bias enable Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:45 -05:00
Tim Rowley	fa36f8ec9c	swr: [rasterizer jitter] support for dumping x86 asm Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:32 -05:00
Tim Rowley	a646ffdacf	swr: [rasterizer core] more backend refactoring BackendPixelRate should be easier to read/maintain now hopefully. Small perf bump by moving some of the pfn's to inline functions without template params. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:21 -05:00
Tim Rowley	8e815ff72c	swr: [rasterizer jitter] add mSimdInt1Ty Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:12 -05:00
Tim Rowley	4e1e0b3a32	swr: [rasterizer core] backend refactor Lump all template args into a bundle of traits, and add some functionality to the MSAA traits. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:40:44 -05:00
Brian Paul	43f46caf76	svga: use the SVGA3D_DEVCAP_MAX_FRAGMENT_SHADER_INSTRUCTIONS query Instead of a hard-coded 512. The query typically returns 65536 now. Fall back to 512 if the query fails as we do for vertex shaders (which should never happen). Note that we don't actually enforce this limit in our shaders but it gets reported via the glGetProgramivARB(GL_MAX_PROGRAM_INSTRUCTIONS_ARB) query. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-27 08:43:33 -06:00
Hans de Goede	b5e7907f30	nouveau: codegen: LOAD: Take src swizzle into account The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Hans de Goede	90f45357ab	nouveau: codegen: LOAD: Do not call fetchSrc(1) if the address is immediate "off" later gets set to NULL when the address is immediate, so move the fetchSrc(1) call to the non-immediate branch of the if-else. This brings handleLOAD's offset handling in line with how it is done in handleSTORE. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Hans de Goede	1958397a58	nouveau: codegen: LOAD: Always use component 0 when getting the address LOAD loads up to 4 components from the specified resource starting at the passed-in x value of the 2nd source operand; the y, z and w components of the address should not be used. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Stefan Dirsch	7d25ed7036	dri3: Check for dummyContext to see if the glx_context is valid According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext() always returns a valid pointer. If no context is made current, it will contain dummyContext. Thus a test for NULL will always fail. https://lists.freedesktop.org/archives/mesa-dev/2016-April/113962.html Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Egbert Eich <eich@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-27 13:03:34 +01:00
Egbert Eich	4d9b518ad2	dri2: Check for dummyContext to see if the glx_context is valid According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext() always returns a valid pointer. If no context is made current, it will contain dummyContext. Thus a test for NULL will always fail. https://bugzilla.opensuse.org/show_bug.cgi?id=962609 Tested-by: Olaf Hering <ohering@suse.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-27 13:03:11 +01:00
Timothy Arceri	6d1a59d15b	glsl: move uniform block validation to link_uniform_blocks.cpp Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-27 16:17:47 +10:00
Kenneth Graunke	73ada723f0	docs: Mention that {ARB,OES}_texture_stencil8 is supported on i965/gen8+ Thanks to Thomas Helland for reminding me to do this.	2016-04-26 21:32:35 -07:00
Kenneth Graunke	fd9a7d8f30	i965: Enable ARB_texture_stencil8 and OES_texture_stencil8 on Gen8+. Stencil texturing is required by ES 3.1. Apparently we never actually turned it on. Do that now. Also turn on the desktop extension. Fixes nine dEQP-GLES31.functional tests: stencil_texturing.format.stencil_index8_2d texture.border_clamp.formats.stencil_index8.nearest_size_pot texture.border_clamp.formats.stencil_index8.nearest_size_npot texture.border_clamp.formats.stencil_index8.gather_size_pot texture.border_clamp.formats.stencil_index8.gather_size_npot texture.border_clamp.unused_channels.stencil_index8 state_query.internal_format.renderbuffer.stencil_index8_samples state_query.internal_format.texture_2d_multisample.stencil_index8_samples state_query.internal_format.texture_2d_multisample_array.stencil_index8_samples Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	12c43a355c	mesa: Try to fix CopyTex[Sub]Image of stencil textures. ES prohibits this, but GL appears to allow it. We at least need this much, or else we'll crash as there's no source to read from. This fixed crashes in the ES tests before I realized I needed to prohibit stencil instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	027c6c1222	mesa: Disallow CopyTexSubImage on stencil formats in ES. Fixes - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8 - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	1e44599a43	i965: Fix MapTextureImage for multi-slice/level stencil buffers. We called intel_miptree_get_image_offset() to get the image offsets for the current level/slice, but then proceeded to ignore the results and clobber level/slice 0 every time. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94713 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	361a24e140	i965: Move TCS output indirect_offset.file check out a level. I want to add another condition. Moving the indirect_offset.file check out a level should make this a little easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:59:56 -07:00
Kenneth Graunke	13195f7ef8	i965/fs: Reduce the response length of sampler messages on Skylake. Often, we don't need a full 4 channels worth of data from the sampler. For example, depth comparisons and red textures only return one value. To handle this, the sampler message header contains a mask which can be used to disable channels, and reduce the message length (in SIMD16 mode on all hardware, and SIMD8 mode on Broadwell and later). We've never used it before, since it required setting up a message header. This meant trading a smaller response length for a larger message length and additional MOVs to set it up. However, Skylake introduces a terrific new feature: for headerless messages, you can simply reduce the response length, and it makes the implicit header contain an appropriate mask. So to read only RG, you would simply set the message length to 2 or 4 (SIMD8/16). This means we can finally take advantage of this at no cost. total instructions in shared programs: 9091831 -> 9073067 (-0.21%) instructions in affected programs: 191370 -> 172606 (-9.81%) helped: 2609 HURT: 0 total cycles in shared programs: 70868114 -> 68454752 (-3.41%) cycles in affected programs: 35841154 -> 33427792 (-6.73%) helped: 16357 HURT: 8188 total spills in shared programs: 3492 -> 1707 (-51.12%) spills in affected programs: 2749 -> 964 (-64.93%) helped: 74 HURT: 0 total fills in shared programs: 4266 -> 2647 (-37.95%) fills in affected programs: 3029 -> 1410 (-53.45%) helped: 74 HURT: 0 LOST: 1 GAINED: 143 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	d800b7daa5	nir: Add a helper for figuring out what channels of an SSA def are read Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	acc2f1fe36	i965/fs: Use inst->regs_written for rlen for texture instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	c7a09c0571	i965/fs: Properly report regs_written from SAMPLEINFO The previous behavior would only allocate one register and then write four thus potentially stomping three innocent bystanders. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	30b37e4e9b	i965/blorp: Set regs_written on texturing instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Kenneth Graunke	0bd956b34b	i965: Don't force a header for texture offsets of 0. Calling textureOffset() with an offset of <0, 0, 0> is equivalent to calling texture(). We don't actually need to set up an offset, which causes a message header to be created. A fairly common pattern is to sample at a point with a bunch of offsets, and average them. It's natural to write all the lookups as textureOffset, but use <0, 0> for the center sample. shader-db results on Skylake: total instructions in shared programs: 9092095 -> 9092087 (-0.00%) instructions in affected programs: 2826 -> 2818 (-0.28%) helped: 12 HURT: 2 total cycles in shared programs: 70870166 -> 70870144 (-0.00%) cycles in affected programs: 15924 -> 15902 (-0.14%) helped: 2 HURT: 0 This also helps prevent code quality regressions in a future patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-26 19:55:04 -07:00
Patrick Rudolph	fb5d38e219	r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier Some apps set NEG and ABS on the source param to test for zero. Use ALU_OP3_CNDE instead of ALU_OP3_CNDGE and unset both modifiers. It also removes the need for a MOV instruction, as ABS isn't supported on op3. Tested on AMD CAYMAN and AMD RV770. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 12:48:50 +10:00
Dave Airlie	7aa3a93656	docs: update softpipe for ARB_compute_shader Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:01:12 +10:00
Dave Airlie	e749c30ceb	softpipe: add support for compute shaders. (v2) This enables ARB_compute_shader on softpipe. I've only tested this with piglit so far, and I hopefully plan on integrating it with my vulkan work. I'll get to testing it with deqp more later. The basic premise is to create up to 1024 restartable TGSI machines, and execute workgroups of those machines. v1.1: free machines. v2: deqp fixes - add samplers support, finish atomic operations, fix load/store writemasks. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:01:03 +10:00
Dave Airlie	f78bcb7638	tgsi/exec: initialise SysSemanticToIndex array to -1 We want to use the SysSemanticToIndex to tell if we've seen the semantics at all. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:46 +10:00
Dave Airlie	fbea4e177f	tgsi/exec: implement restartable machine. This lets us restart the machine at a PC value, and exits the machine when we hit a barrier. Compute shaders will then execute all the threads up to the barrier, then restart the machines after the barrier once all are done. v2: comment the code a bit, change return types. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:44 +10:00
Dave Airlie	8ffa3c58d4	tgsi/exec: make inputs/outputs optional for compute shaders. compute shaders don't need input/outputs so don't bother allocating memory for these. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:41 +10:00
Dave Airlie	16a9dc1e49	tgsi/exec: implement load/store/atomic on MEMORY. This implements basic load/store/atomic ops on MEMORY types for compute shaders. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:35 +10:00
Dave Airlie	354c5f2d0f	tgsi/exec: split out setting up masks to separate function This is just a cleanup that will make later changes easier to make. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:56:22 +10:00
Dave Airlie	6cf36a7231	tgsi: accept a starting PC value for exec machine. This will be used later to restart barriered execution threads in compute, for now we just want to change the API. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:56:17 +10:00
Dave Airlie	912ed84f83	tgsi: move to using vector for system values. For compute support some of the system values are .xyz types, so move to using a vector instead of a single channel. [airlied: squash swizzle fix from compute series]. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:26:53 +10:00
Dave Airlie	9013d9267c	tgsi/exec: fix system value handling. a) SysSemanticToIndex needs to be indexed with the semantic name not the decl->Declaration.Semantic. b) doing this in run is too late, as the mappings are all setup prior to run in the execs. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:25:38 +10:00
Jason Ekstrand	4040fff81d	i965/blorp: Convert state setup to C Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	71775afe6e	i965/blorp: Make state setup C-safe Previously they (very rarely) used C++isms that prevented them from being compiled as C. As of this commit, they can be compiled as either C or C++. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	bed74299c2	i965/blorp: Convert brw_blorp.cpp to a C file Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	0551f3dfa4	i965/blorp: Make all of brw_blorp.h accessible to C Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	b3f08b5424	i965/blorp: Turn brw_blorp_params into a C-style struct Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	33fa12c50f	i965/blorp: Turn coord_transform into a C-style struct Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	b6dd8e42f0	i965/blorp: Turn blorp_surface_info into a C-style struct This commit is mostly mechanical except that it changes where we set the swizzle. Previously, the blorp_surface_info constructor defaulted the swizzle to SWIZZLE_XYZW. Now, we memset to zero and fill out the swizzle when we setup the rest of the struct. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	a543f741bf	i965/blorp: Roll mip_info into surface_info Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	3839936497	i965/blorp: Get rid of the blorp_blit_params class It was really just a wrapper around the function that constructed it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	8096ed7e27	i965/blorp: Remove the hiz params class Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	e35d9407dc	i965/blorp: Remove the clear params classes They didn't really add anything other than a key and extra layers of function calls. This commit just inlines the extra functions and gets rid of the extra classes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	659400cba3	i965/blorp: Remove the arguments to brw_blorp_params() No one was using anything other than the defaults. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	2dda4ff014	i965/blorp: Refactor to get rid of the get_wm_prog virtual function Instead of having a virtual member function for getting the WM/PS kernel, we simply add fields for prog_data and the kernel to brw_blorp_params and always make sure those get set as part of the different constructors. v2: Use prog_data != NULL to check for a valid program instead of a magic kernel offset value Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Tim Rowley	18d1658633	swr: autogenerate swr_context_llvm.h Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-26 16:45:26 -05:00
Laurent Carlier	12cf08fcc3	anv: honor DESTDIR when installing icd file https://bugs.freedesktop.org/show_bug.cgi?id=94969 Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:57:54 -07:00
Juha-Pekka Heikkila	ec5f7fc7bd	i965/meta: initialize values to avoid random behaviour on error path If brw_meta_stencil_blit() errored at the wrong place, 'target' would be uninitialized and cause random behaviour on leaving the function. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:54:29 -07:00
Juha-Pekka Heikkila	51632d6f27	meta: Avoid random memory access on error Initialize drawFb to NULL in _mesa_meta_CopyImageSubData_uncompressed(). If getting readFb fails, an uninitialized drawFb will cause randomness on cleanup. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:54:02 -07:00
Grazvydas Ignotas	cea3a7e615	mesa: add tags file to gitignore For ctags users like me. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:49:27 -07:00
Jakob Sinclair	dda50af9c4	mesa: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	e5d027ec7d	glx: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	ea327dc451	gallium: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	de743a07ac	egl: Remove every double semi-colon Removes all accidental semi-colons in egl. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	e129e6eb89	gallium/r600: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	12da8bb5f4	mesa/main: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	09e4ac00ac	glsl: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jose Fonseca	52c7443932	glx: Don't enclose includes inside `extern "C" { }`. Ran `make check` inside src/glx to verify everything compiles and links correctly. https://bugs.freedesktop.org/show_bug.cgi?id=95158 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 21:28:34 +01:00
Marek Olšák	80e5fb60b4	radeonsi: add RW_BUFFERS only once in si_ce_needed_cs_space Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-26 21:37:07 +02:00
Marek Olšák	2b4b5ebfcf	egl: fix make check broken by interop support	2016-04-26 21:37:07 +02:00
Samuel Pitoiset	e64ee4cf60	docs: mark ARB_compute_shader as done for nvc0 This was merged a few months ago, but this should help https://mesamatrix.net/ to update its list of supported extensions. Please note that compute shaders are not really useful without ARB_image_load_store and only GK104 and GK110 support it for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-26 21:10:10 +02:00
Samuel Pitoiset	5c429f88d9	nvc0: expose GLSL version 420 on GK110 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	a0e777f6a1	nvc0: enable ARB_shader_image_load_store on GK110 This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	2daaa5d657	gk110/ir: add emission for VSHL Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	af5925209d	gk110/ir: add emission for OP_SUEAU, OP_SUBFM and OP_SUCLAMP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1f8900a8e0	gk110/ir: add emission for OP_SULDB and OP_SUSTx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fddd8523d4	gk110/ir: add emission for OP_MADSP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c2ce22ca46	gk110/ir: add emission for OP_PERMT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	222d1a1bff	nvc0: expose GLSL version 420 on GK104 Other chipsets will be added later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Ilia Mirkin	9e367ed480	nvc0: enable ARB_shader_image_load_store on GK104 This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	0d64d39e81	nvc0: inform users that 3D images are not fully supported 3D images are a bit more complicated to implement and will probably cause a bunch of headaches; we don't care for now because they do not seem to be really used by apps. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fdbb476829	nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+ The blob sets it to 2048 and using 4096 reports an INVALID_DATA error with RT_ARRAY_MODE when z is 4096. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	6fc6d548ed	nvc0/ir: check that the image format doesn't mismatch This re-uses NVE4_SU_INFO_CALL which is not used anymore because we don't use our lib for format conversions. While we are at it, add a todo for image buffers because there are some robustness-related issues to fix. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fbeb69757c	nvc0/ir: prevent out of bounds when no images are bound Checking if the image address is not 0 should be enough to prevent read faults. To improve robustness, make sure that the destination value of atomic operations is correctly initialized in case the instruction is not performed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	5ba5714483	nvc0/ir: add indirect support for images on Kepler This fixes arb_shader_image_load_store-indexing and arb_shader_image_load_store-max-images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	8b540db44c	nvc0/ir: fix 1D arrays images for Kepler For 1D arrays, the array index is stored in the Z component. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e478156ed7	nvc0/ir: fix cube images for Kepler Like 2d array images, the z-dimension needs to be clamped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Ilia Mirkin	3ce80f924d	nv50/ir: add support for SULDP -> SULDB conversion This will allow to convert surface formats without adding an extra call to our lib. [hakzsam: make use of this for GK104] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	d64ea4e48e	nv50/ir: make use of OP_SUQ for surfaces query This implements RESQ for surfaces, which comes from the imageSize() GLSL builtin. As the dimensions are stored in the driver constant buffer, this only has to be lowered with loads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v2)	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	7c47db359e	nv50/ir: add OP_BUFQ for buffers query TGSI RESQ allows both images and buffers but we have to make a distinction between these two types of resources in our lowering pass. Introducing OP_BUFQ, which is a fake operand, will allow us to implement OP_SUQ for surfaces. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e09434047d	nv50/ir: enable early fragment test with explicit user control This feature can be enabled in two ways: as an optimization and by explicit user control (with OpenGL 4.2 or ARB_shader_image_load_store). This makes use of the recent TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL to force early fragment tests when needed. This fixes a bunch of dEQP-GLES31.functional.image_load_store.early_fragment_tests.* tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	08f4faa542	nvc0/ir: fix constraints for OP_SUSTx on Kepler Destination type is actually always 32-bits, so typeSizeof() returns 4 and no sources are condensed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	119d087758	nv50/ir: re-introduce TGSI lowering pass for images This is loosely based on the previous lowering pass written by calim four years ago. I cleaned up the code and fixed some issues. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	76ea143c38	nv50/ir: add support for TGSI image declarations Old and dead resource code will be removed once images are completely done. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1fb3cd2489	nvc0: add missing glMemoryBarrier bits This fixes a bunch of subtests of arb_shader_image_load_store-host-mem-barrier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	9bc18a48f3	nvc0: enable RGB10_A2UI format on GK104 No clue why this was not enabled by default before, maybe because the SULDP conversion was wrong. Anyway, this helps in fixing all rgb10_a2ui piglit tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	da8171dc75	nvc0: shift address with blocksize for image buffers This fixes a bunch of dEQP image buffers related tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	285f2edd14	nvc0: fix address offset when images have multiple levels This fixes arb_shader_image_load_store-level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e28f247e24	nvc0: bind images on 3D shaders for Kepler Similar to surfaces validation for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1eca4c51a2	nvc0: bind images on compute shaders for Kepler Old surfaces validation code will be removed once images are completely done for Fermi/Kepler, that explains why I only disable it for now. This also introduces nvc0_get_surface_dims() which computes correct dimensions regarding the given target. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c6b3c346d1	nvc0: reserve an area for surfaces info in the driver constbuf To process surfaces coordinates from the codegen part, and because some information like the format is not always available (eg. when writeonly is used), we have to stick some surfaces data in the driver constbuf. This is especially true for OpenCL because we don't know the format at shader compile time. This bumps the size of each shader area from 1K to 2K. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	afa04785fa	nvc0: add preliminary support for images This implements set_shader_images() and resource invalidation for images. As OpenGL requires at least 8 images, we are going to expose this minimum value even if this might be raised for Kepler, but this limit is mainly for Fermi because the hardware only accepts 8 images. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c62b1b92f7	gk110/ir: add emission for (a OP b) OP c This is pretty similar to NVC0 except that offsets have changed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	3da8528846	nvc0/ir: fix wrong emission of (a OP b) OP c The third source must be emitted at offset 49 instead of 17 and the not modifier is at 52 instead of 20. If you look a bit above in emitLogicOp() you will see that the dest is emitted at 17 which confirms that src(2) is obviously wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
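The offsets quoted in `3da8528846` can be illustrated with a small bit-packing sketch. This is not the nouveau emitter: `emit_field` and `emit_dst_and_src2` are hypothetical helpers, and the 3-bit field width is only an assumption inferred from the "not" bit sitting three positions above each field (17/20 and 49/52).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: write 'value' into 'width' bits of a 64-bit
 * instruction word starting at bit 'pos' (width must be below 64). */
uint64_t emit_field(uint64_t code, unsigned pos, unsigned width, uint64_t value)
{
   uint64_t mask = ((UINT64_C(1) << width) - 1) << pos;
   return (code & ~mask) | ((value << pos) & mask);
}

/* Hypothetical encoder for the two fields the commit message mentions.
 * The destination sits at bit 17, so emitting the third source at 17 as
 * well would overwrite it; the fix places it at 49 with its "not"
 * modifier at 52. The 3-bit width is an assumption for illustration. */
uint64_t emit_dst_and_src2(unsigned dst, unsigned src2, bool src2_not)
{
   uint64_t code = 0;
   code = emit_field(code, 17, 3, dst);
   code = emit_field(code, 49, 3, src2);
   code = emit_field(code, 52, 1, src2_not ? 1 : 0);
   return code;
}
```

With distinct offsets, both fields can be read back independently, which is the property the original offsets violated.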
Jose Fonseca	a2fe35bcdf	scons: Support Clang on Windows. - Introduce 'gcc_compat' env flag, for all compilers that define __GNUC__, (which includes Clang when it's not emulating MSVC.) - Clang doesn't support whole program optimization - Disable enumerator value warnings (not sure why Clang warns about them, as my understanding is that MSVC promotes enums to unsigned ints automatically.) This is not enough to build with Clang + AddressSanitizer though. More follow up changes will be required for that. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	dcc3baf733	gallium: Include intrin.h instead of defining ourselves. More portable, particularly when building with Clang, which implements all MSVC intrinsics in its own intrin.h, but doesn't actually support `#pragma intrinsic`. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	9a25c8af1b	scons: Whenever possible decide what to do based on platform and not compiler. Because compilers like GCC and Clang are effectively available everywhere so their presence/absence is seldom conclusive. Furthermore, all compilers we use now have stdint.h. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	c068610a7d	scons: Move fallback HAVE_* definitions to headers. These were being defined in SCons, but it's not practical: - we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines everywhere. - checking compiler version via command line doesn't really work due to Clang essentially being like a chameleon which can fake either GCC or MSVC. There's no change for autoconf. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Juha-Pekka Heikkila	940da2ce0e	nir: Add missing break into switch in construct_value() One break seemed to be missing in the nested switch cases. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-26 17:45:56 +02:00
Bas Nieuwenhuizen	31631d8515	radeonsi: Fix memory leak in error path. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 15:41:19 +02:00
Oded Gabbay	514c5b5f4b	radeonsi: fix build error because of missing param Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 13:48:43 +03:00
Oded Gabbay	965175aba3	r600g: use do_endian_swap in texture swapping function For some texture formats we need to take "do_endian_swap" into account when configuring their swizzling. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	c86c761343	r600g: use do_endian_swap in color swapping functions For some formats we need to take "do_endian_swap" into account when configuring swapping for color buffers. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	686ad477bd	r600g: set endianness of 16/32-bit buffers according to do_endian_swap This patch modifies r600_colorformat_endian_swap(), so for 16-bit and for 32-bit buffers, the endianness configuration will be determined not only by the color/texture format, but also by the do_endian_swap parameter. The only exception is for array formats, which are always set to not do swapping, because for them gallium sets an alias based on the machine's endianness. v4: V_0280A0_COLOR_16_16 and V_0280A0_COLOR_16_16_FLOAT should be set to 8IN16 because the bytes inside need to be swapped even for array formats. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	2242dbe11d	r600g/radeonsi: send endian info to format translation functions Because r600 GPUs can't swap in their DB unit, we need to disable endianness swapping for textures that are handled by DB. There are four format translation functions in the r600g driver: - r600_translate_texformat - r600_colorformat_endian_swap - r600_translate_colorformat - r600_translate_colorswap This patch adds a new parameter to those functions, called "do_endian_swap". When running on a big-endian machine, the calling functions will check whether the texture/color is handled by DB - "rtex->is_depth && !rtex->is_flushing_texture" - and if so, they will send FALSE through this parameter. Otherwise, they will send TRUE. The translation functions, in specific cases, will look at this parameter and configure the swapping accordingly. v4: evergreen_init_color_surface_rat() is only used by compute and doesn't handle DB surfaces, so just send hard-coded FALSE to translation functions when called by it. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Ilia Mirkin	4965c5bf72	glsl: add ability to use essl 3.20 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-25 23:40:54 -04:00
Ilia Mirkin	fa8c0ccfbc	main: select ES3.2 version when all extensions are available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-25 23:40:34 -04:00
Dave Airlie	e3e6859381	tgsi: pass a shader type to the machine create and clean up. There were definitely bugs here mixing up the PIPE_ and TGSI_ defines; hopefully they didn't cause any problems, since mostly it was special cases for GEOMETRY. This clarifies at shader machine create what type of shader this machine will execute. This is also needed for compute shaders, where we don't want to allocate inputs/outputs. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 13:05:32 +10:00
Dave Airlie	a6aae0c24d	gallium/tgsi: move tgsi_exec.h header out of draw_context.h It gets annoying that changing the tgsi exec rebuilds the state tracker unnecessarily. Putting this include into draw_gs.h which uses it causes a lot less rebuilds. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 13:00:57 +10:00
Roland Scheidegger	bd07e20d20	gallivm: make sampling more robust against bogus coordinates Some cases (especially these using fract for coord wrapping) did not handle NaNs (or Infs) correctly - the following code assumed the fract result could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract result was NaN - which then could produce out-of-bound offsets. (Note that the explicit NaN behavior changes for min/max on x86 sse don't result in actual changes in the generated jit code, but may on other architectures. Found by looking through all the wrap functions.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955 No piglit changes. (v2: fix min/max typo in coord_mirror, add comment) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Tested-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-26 04:55:37 +02:00
Dave Airlie	d8edc3e97c	radeonsi: fix missing include for Elements. Since u_blitter.h no longer defines this. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 09:36:23 +10:00
Samuel Pitoiset	d12c3b02ff	nvc0: bump the amount of shared memory per MP on Maxwell According to the CUDA compute capability version, GM10x can expose 64KB of shared memory while GM20x can use 96KB. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 00:32:25 +02:00
Dave Airlie	5b6a1aee46	r600: fix missing include for Elements macro This got removed from u_blitter.h and we were taking it from there, this should just move to ARRAY_SIZE eventually. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 08:01:01 +10:00
Samuel Pitoiset	725431a5db	gm107/ir: s/invalid load/invalid store/ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-25 23:55:52 +02:00
Rob Clark	d2fcd0ce38	freedreno/a3xx: remove unused fxn Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 17:10:14 -04:00
Rob Clark	8fe2076243	freedreno/ir3: convert over to ralloc The home-grown heap scheme (which is ultra-simple but probably not good to always allocate and memset such a chunk of memory up front) was a remnant of fdre (where the ir originally came from). But since we have ralloc in mesa, let's just use that instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 17:09:09 -04:00
Rob Clark	27cf3b0052	mesa/st: log some additional invalid-fbo cases Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 17:08:22 -04:00
Rob Clark	2c8674f5a9	freedreno: honor handle->offset Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:22 -04:00
Rob Clark	dfd23abdcc	freedreno: disallow cat4 immed src Normally this would never happen (constant-propagation in NIR would eliminate the instruction), except it does happen for 'undef' which we turn into immed 0.0 for bookkeeping purposes. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	76c6cdd36a	freedreno/a4xx: add render-target formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	7add166a5c	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	edcc6ce75d	freedreno: reduce line width for deqp further See a7eb12d0.. but that wasn't restrictive enough. Fixes dEQP-GLES3.functional.rasterization.primitives.line_strip_wide, and similar Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	4610e5ef28	freedreno/ir3: fix sin/cos We seem to need range reduction to get sane results. Fixes glmark2 jellyfish bench, and a whole bunch of dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.* v2: squashed in android build fixes from Rob Herring Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Kenneth Graunke	21b4bcdd05	i965: Unroll SIMD16 DDY_FINE on Sandybridge. This fixes 10 dEQP-GLES3 subtests: dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*. Matt noticed that our Piglit tests for this use even numbered registers, while the failing dEQP tests use odd numbered registers. We believe that it works for even numbered registers, but not otherwise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 13:13:00 -07:00
Brian Paul	e915903c10	docs: update the instructions for getting a git account Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 14:10:40 -06:00
Brian Paul	ef3f00edd8	docs: update link to Intel's graphics website Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 14:10:40 -06:00
Jordan Justen	50b82ecd77	mesa/gles: Allow format GL_RED to be used with MESA_FORMAT_R_UNORM If the bound framebuffer has a format of MESA_FORMAT_R_UNORM, then IMPLEMENTATION_COLOR_READ_FORMAT will return GL_RED. This change applies to OpenGLES contexts where additional restrictions are placed on the formats that are allowed to be supported. Fixes OpenGLES 3.1 CTS tests: * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16Linear * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32F * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32FLinear Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-25 12:09:15 -07:00
Charmaine Lee	c4cb879f00	svga: eliminate unnecessary constant buffer updates Currently if the texture binding is changed, emit_fs_consts() is triggered to update texture scaling factor for rectangle texture or texture buffer size in the constant buffer. But the update is only relevant if the texture binding includes a rectangle texture or a texture buffer. To eliminate the unnecessary constant buffer updates due to other texture binding changes, a new flag SVGA_NEW_TEXTURE_CONSTS will be used to trigger fragment shader constant buffer update when a rectangle texture or a texture buffer is bound. With this patch, the number of constant buffer updates in Lightsmark2008 reduces from hundreds per frame to about 28 per frame. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	686cd3c606	svga: mark the texture dirty for write transfer map only Instead of unconditionally marking the texture subresource dirty at transfer map, we'll set the dirty bit for write transfer only. Tested with lightsmark2008 and glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	676931640f	svga: fix assert with PIPE_QUERY_OCCLUSION_PREDICATE for non-vgpu10 With this patch, when running in hardware version 11, we'll use SVGA3D_QUERYTYPE_OCCLUSION query type for PIPE_QUERY_OCCLUSION_PREDICATE and return TRUE if samples-passed count is greater than 0. Fixes glretrace/solidworks2012_viewport running in hardware version 11. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	d7a6c1a476	svga: minimize surface flush Currently, we always do a surface flush when we try to establish a synchronized write transfer map. But if the subresource has not been modified, we can skip the surface flush. In other words, we only need to do a surface flush if the to-be-mapped subresource has been modified in this command buffer. With this patch, lightsmark2008 shows about 15% performance improvement. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Frederic Devernay	23949cdf2c	glapi: fix _glapi_get_proc_address() for mangled function names In the dispatch table, all functions are stored without the "m" prefix. Modify code so that OSMesaGetProcAddress works both with gl and mgl prefixes. Similar to https://lists.freedesktop.org/archives/mesa-dev/2015-September/095251.html Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94994 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
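The lookup behaviour described in `23949cdf2c` can be sketched as follows. This is not Mesa's glapi code: `strip_mangle_prefix`, `lookup_index` and the three-entry table are hypothetical, chosen only to show a table that stores un-prefixed names while the lookup accepts both `gl` and `mgl` spellings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if the name starts with "mgl", skip the leading
 * 'm' so it can be compared against entries stored without the prefix. */
const char *strip_mangle_prefix(const char *name)
{
   if (name[0] == 'm' && name[1] == 'g' && name[2] == 'l')
      return name + 1;
   return name;
}

/* Hypothetical stand-in for a dispatch table keyed by plain "gl" names. */
static const char *const known_functions[] = {
   "glBegin", "glEnd", "glVertex3f",
};

/* Returns the table index for either spelling, or -1 if not found. */
int lookup_index(const char *name)
{
   const char *plain = strip_mangle_prefix(name);
   size_t count = sizeof(known_functions) / sizeof(known_functions[0]);

   for (size_t i = 0; i < count; i++) {
      if (strcmp(plain, known_functions[i]) == 0)
         return (int)i;
   }
   return -1;
}
```

Normalising the name once at the entry point keeps the table itself free of duplicated mangled entries.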
Brian Paul	63df017fda	util/blitter: use ARRAY_SIZE macro And remove local definition of Elements() macro. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-25 12:59:29 -06:00
Brian Paul	e0184b3995	svga: s/Elements/ARRAY_SIZE/ Standardize on the latter macro rather than a mix of both. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-25 12:59:29 -06:00
Brian Paul	77e4b41671	svga: whitespace and formatting fixes in svga_pipe_rasterizer.c	2016-04-25 12:59:29 -06:00
Brian Paul	25e0d3659f	svga: whitespace and formatting fixes in svga_pipe_depthstencil.c	2016-04-25 12:59:29 -06:00
Brian Paul	595fbc8dee	svga: whitespace and formatting fixes in svga_pipe_sampler.c	2016-04-25 12:59:29 -06:00
Brian Paul	1db8313168	gallium/util: initialize pipe_framebuffer_state to zeros To silence a valgrind uninitialized memory warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-25 12:59:29 -06:00
Brian Paul	1e990978ee	util/cache: add comments, fix formatting	2016-04-25 12:59:29 -06:00
Kenneth Graunke	4e2d22c5a7	i965: Mark URB reads as volatile. They can be affected by URB writes. In the upcoming scalar TCS backend, this prevents read-modify-write cycles from being broken by CSE removing reads. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-25 11:45:15 -07:00
Kenneth Graunke	501bedffa6	i965: Make a few tessellation related functions non-static. Also, move them to brw_shader.cpp so they're in a location for code used by both the vec4 and fs worlds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-25 11:44:48 -07:00
Brian Paul	464d6080c6	svga: separate HUD counters for state objects Count depth/stencil, blend, sampler, etc. state objects separately but just report the sum for the HUD. This change lets us use gdb to see the breakdown of state objects in more detail. Also, count sampler views too. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-25 09:45:16 -06:00
Robert Foss	b87856d25d	st/omx: Fix resource leak on OMX_ErrorNone Avoid leaking buffer allocated for task if an error has occurred. Coverity id: 1213929 Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 15:09:37 +01:00
Jonathan Gray	3c8f9ed9b7	isl: remove ffs function that conflicts with system headers Remove a wrapper around __builtin_ffs that conflicts with system headers on OpenBSD and perhaps elsewhere: isl_priv.h:44: error: conflicting types for 'ffs' v2: include strings.h to ensure prototype is found Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 15:06:46 +01:00
Grazvydas Ignotas	dc732a8ef2	gallium: use unreachable instead of asserts Avoids warnings in release builds. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:34 +02:00
Grazvydas Ignotas	d14778656b	anv: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:31 +02:00
Grazvydas Ignotas	ff48375a16	isl: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:28 +02:00
Grazvydas Ignotas	29d2c0e9e6	spirv: fix warning in release build Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:25 +02:00
Grazvydas Ignotas	cbb0d4ad75	gallium: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:21 +02:00
Grazvydas Ignotas	bbeb9ab2f7	glsl: fix warning in release build Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:16 +02:00
Grazvydas Ignotas	e4fc06a2f8	util: add MAYBE_UNUSED for config dependent variables This is mostly for variables that are only used in asserts and cause unused-but-set-variable warnings in release builds. Could just use UNUSED directly, but MAYBE_UNUSED should be less confusing and is similar to what the Linux kernel has. And yes __attribute__((unused)) can be used on variables on both GCC 4.2 (oldest supported by mesa) and clang 3.0 (just some random old version, not sure what's the minimum for mesa). Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:10 +02:00
Hans de Goede	787a53988c	nouveau: codegen: combineLd/St do not combine indirect loads combineLd/St would combine, i.e. : st u32 # g[$r2+0x0] $r2 st u32 # g[$r2+0x4] $r3 into: st u64 # g[$r2+0x0] $r2d But this is only valid if r2 contains an 8 byte aligned address, which is not guaranteed for compute shaders This commit checks for src0 dim 0 not being indirect when combining loads / stores as combining indirect loads / stores may break alignment rules. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-25 11:45:07 +02:00
Rob Clark	0831eb94b9	freedreno/ir3: relax restriction in grouping Currently we were too restrictive, and would insert an output move in cases like: MOV OUT[0], IN[0].xyzw Loosen the restriction to allow the current instruction to appear in the neighbor list but only at its current position. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	36c9ea6e79	freedreno/ir3: fix small memory leak Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	610837fb98	freedreno/ir3: fix small RA bug Normally the offset in the group would be the same, but not always. For example, in a sam(w) which only writes the 4th component. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	adf795432f	freedreno/a4xx: better workaround for astc+srgb This seems like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this setting up a second texture state and using that to sample alpha separately. This way, srgb->linear conversion happens in hw prior to interpolation. This fixes 546 dEQP tests: dEQP-GLES3.functional.texture.astcsrgb* Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	a148300b13	Revert "freedreno/a4xx: lower srgb in shader for astc textures" Better workaround in the following patch. This reverts commit `899bd63ace`.	2016-04-24 13:40:57 -04:00
Rob Clark	19118e6f47	freedreno/a4xx: blend state no longer depends on fb state Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Marek Olšák	c0c6ca40a2	Revert "st/dri: add 32-bit RGBX/RGBA formats" This reverts commit `ccdcf91104`. It breaks most KDE apps, because DRI doesn't support the RGBA component ordering. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95071	2016-04-24 15:16:07 +02:00
Jonathan Gray	147a2d25ad	genxml: use PYTHON3 Allows the build to work when the python3 binary is not "python3". v2: remove x bit from the script at Emil's suggestion Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 16:45:05 -07:00
Nanley Chery	710b1d2e66	i965/tex_image: Flush certain subnormal ASTC channel values When uploading a linear, void-extent, ASTC LDR block on Skylake, we are required to flush to zero the UNORM16 channel values that would be denormalized. This is specifically required for the values: 1, 2, and 3. Fixes the 14 failing tests in: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.* v2: Split out flushing function (Kristian Høgsberg) v3: Map with READ instead of INVALIDATE (Kenneth Graunke) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 11:35:08 -07:00
Jonathan Gray	e29b3bfd6e	configure.ac: search for and set PYTHON3 src/intel/genxml/gen_pack_header.py requires python3. v2: check for python3.5 as well Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 01:06:20 -07:00
Topi Pohjolainen	f8dd07a2c3	i965/blorp: Enable for buffer resolves Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94181 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	c7cf17ae75	i965/blorp: Enable for normal color clears Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	c4ec0121a8	i965/blorp: Fix clear code for ignoring colormask for XRGB formats on Gen9+ This is equivalent of `73b01e2711` for blorp. v2 (Ken): No need to call _mesa_format_has_color_component() now that the number of components is gotten from _mesa_base_format_component_count(). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	19948f1bf6	mesa/formats: Take luminance into account in component count Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	9e153c0692	i965/blorp: Do not trigger re-emission of base state address In case blorp needs to configure it will be just as if render or compute pipeline had configured it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:28:58 +03:00
Topi Pohjolainen	84db9ca3f7	i965/blorp: Reconfigure base state address only if needed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	234b5f23f8	i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Kenneth Graunke	6d5ce1b043	i965: Make all atoms to track BRW_NEW_BLORP by default Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	65a5af6dd0	i965: Introduce state flag for blorp In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger re-emission of the entire 3D pipeline on the next draw. However, there are some packets BLORP simply leaves alone, so there's no need to re-emit them. Trying to reduce the set of dirty bits flagged after BLORP runs is tricky. Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any atom which emits a packet that BLORP also emits. When BLORP runs, it will flag BRW_NEW_BLORP, causing those packets to get re-emitted. This also makes it easy to avoid re-emitting specific atoms - we can simply drop the BRW_NEW_BLORP flag on those. To start, we assume that all packets need to be re-emitted. This is the safest approach and closest to the existing code's behavior. Many of these are obviously not required, and can be dropped in subsequent patches. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	0e850452d1	i965/blorp/gen6: Use normal base state address setup This is identical to the blorp version which only differs in case fragment shader isn't used. In that case blorp would reset batch buffer address to zero. This is not really needed, and having blorp to use base state address setup that is compatible with normal upload allows one to skip resetting it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	ae73e86497	i965: Remove pointers to non-existing atoms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Tom Stellard	9f110a9e10	radeonsi: Implement ddx/ddy on VI using ds_bpermute The ds_bpermute instruction allows threads to transfer data directly to or from the vgprs of other threads. These instructions use the LDS hardware to transfer data, but do not read or write LDS memory. DDX BEFORE: \| DDX AFTER: \| v_mbcnt_lo_u32_b32_e64 v2, -1, 0 \| v_mbcnt_lo_u32_b32_e64 v2, -1, 0 v_mbcnt_hi_u32_b32_e64 v2, -1, v2 \| v_mbcnt_hi_u32_b32_e64 v2, -1, v2 v_lshlrev_b32_e32 v4, 2, v2 \| v_and_b32_e32 v2, 60, v2 v_and_b32_e32 v2, 60, v2 \| v_lshlrev_b32_e32 v2, 2, v2 v_lshlrev_b32_e32 v3, 2, v2 \| ds_bpermute_b32 v3, v2, v0 s_mov_b32 m0, -1 \| ds_bpermute_b32 v0, v2, v0 offset:4 ds_write_b32 v4, v0 \| s_waitcnt lgkmcnt(0) s_waitcnt lgkmcnt(0) \| v_or_b32_e32 v0, 1, v2 \| v_lshlrev_b32_e32 v0, 2, v0 \| ds_read_b32 v1, v3 \| ds_read_b32 v0, v0 \| s_waitcnt lgkmcnt(0) \| \| LDS: 1 blocks \| LDS: 0 blocks Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:43 +00:00
Tom Stellard	128267d781	radeonsi: Use llvm.amdgcn.mbcnt.* intrinsics instead of llvm.SI.tid We're trying to move to more of the new style intrinsics which include the correct target name, and map directly to ISA instructions. v2: - Only do this with LLVM 3.8 and newer. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:43 +00:00
Tom Stellard	d3427412a3	radeonsi: Set range metadata on calls to llvm.SI.tid The range metadata tells LLVM the range of expected values for this intrinsic, so it can do some additional optimizations on the result. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:41 +00:00
Tom Stellard	b31422d970	radeonsi: Create a helper function for computing the thread id Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:45:34 +00:00
Nanley Chery	86cd9a134f	i965: Disable KHR_texture_compression_astc_hdr on Gen9 Although Gen9 samples from most HDR ASTC surfaces correctly, there currently are no software workarounds to fix the incorrect sampling that occurs in others of certain color endpoint modes. With this change, we are no longer failing the 14 tests from: dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.* Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 16:57:38 -07:00
Tim Rowley	ec089cd987	swr: [rasterizer memory] Constify load tiles Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:49:20 -05:00
Tim Rowley	6facf4b74a	swr: [rasterizer core] CompleteDrawContext changes for gcc Add explicit inline and non-inline versions of CompleteDrawContext to make gcc happy. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:49:04 -05:00
Tim Rowley	0487377dce	swr: [rasterizer] Small cleanups Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:56 -05:00
Tim Rowley	2c4c3c9c71	swr: [rasterizer scripts] Knob scripts tweaks Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:47 -05:00
Tim Rowley	ef293ee9c0	swr: [rasterizer] Interpolation utility functions v2: use _mm_cmpunord_ps for vIsNaN Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:38 -05:00
Tim Rowley	27cc5924ea	swr: [rasterizer core] TemplateArgUnroller Switch boolean template arguments to typename template arguments of type std::integral_constant<bool, VALUE>. This allows the template argument unroller to easily be extended to enums. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:29 -05:00
Tim Rowley	46a448d161	swr: [rasterizer core] Arena: make most allocated blocks the same size Reduces sorting cost Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:20 -05:00
Tim Rowley	794be41f91	swr: [rasterizer core] Fix global arena allocator bug - Plus some minor code refactoring Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:11 -05:00
Tim Rowley	e42f00ee39	swr: [rasterizer core] Fix thread binding for 32-bit windows Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:59 -05:00
Tim Rowley	cd21f90ecf	swr: [rasterizer fetch] Add support for fetching non-uniform component formats For example, R10G10B10A2_UNORM. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:48 -05:00
Tim Rowley	244ae7af1b	swr: [rasterizer core] Use CS spill/fill size in core Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:02 -05:00
Tim Rowley	ee9621e2f5	swr: fix memory leaks from vs/fs compilation v2: varient -> variant Reviewed by: George Kyriazis <George.Kyriazis@intel.com>	2016-04-22 18:05:02 -05:00
Tim Rowley	5815c8b3d3	swr: fix clang warnings v2: use alternate logic version in swr_check_render_cond Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:03:41 -05:00
Rob Clark	e85bef8b12	freedreno/a4xx: fix encoding of blend color state Fixes a whole bunch of dEQP-GLES3.functional.fragment_ops.random.* (now they all pass) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-22 15:00:34 -04:00
Rob Clark	23abc41d2b	freedreno: update generated headers Pull in RB_BLEND_* fixes. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-22 15:00:34 -04:00
Eric Anholt	79b36168e0	vc4: Make sure we recompile when sample_mask changes. Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted (there is also a bug with RCL tiled blits) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 11:27:11 -07:00
Eric Anholt	876c647194	vc4: Fix validation of full res tile offset if used for non-MSAA. There's no reason we couldn't do non-MSAA full resolution tile buffer load/stores, but we would have claimed buffer overflow was being attempted. Nothing does this currently.	2016-04-22 11:27:11 -07:00
Eric Anholt	3fecaf0d0c	vc4: Only do MSAA FB operations if the FB is MSAA. I noticed this as a problem with ET:QW traces emitting coverage code when the framebuffer was supposed to be single sampled.	2016-04-22 11:27:11 -07:00
Eric Anholt	1410403e1e	vc4: Fix tests for format supported with nr_samples == 1. This was a bug from the MSAA enabling. Tests for surfaces with nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly fail out. Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 11:27:11 -07:00
Eric Anholt	6eabdb8959	vc4: Don't try to blit from MSAA surfaces with mismatched width to dst. I had made the previous blit fix non-MSAA only because I was thinking about how the hardware infers stride from the RENDERING_CONFIG packet. However, I'm also inferring the stride for both MSAA src and dst in vc4_render_cl.c from the width argument in the ioctl. Fixes 15 EXT_framebuffer_multisample piglit tests.	2016-04-22 11:27:11 -07:00
Kenneth Graunke	42dea145d9	i965: Disable channel expressions for scalar GS, TCS, TES. On Broadwell, I get the following shader-db statistics: Tessellation Control Shaders: total instructions in shared programs: 57327 -> 57012 (-0.55%) instructions in affected programs: 27334 -> 27019 (-1.15%) helped: 45 HURT: 0 total cycles in shared programs: 265692 -> 255188 (-3.95%) cycles in affected programs: 263122 -> 252618 (-3.99%) helped: 184 HURT: 26 Tessellation Evaluation Shaders: total instructions in shared programs: 23236 -> 23157 (-0.34%) instructions in affected programs: 2791 -> 2712 (-2.83%) helped: 27 HURT: 0 total cycles in shared programs: 151858 -> 149704 (-1.42%) cycles in affected programs: 151858 -> 149704 (-1.42%) helped: 101 HURT: 114 Geometry Shaders: Orbital Explorer goes from 6442 -> 6356 instructions. Two Shadow of Mordor shaders increase by a single instruction. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-22 10:26:30 -07:00
Topi Pohjolainen	1883613a24	i965/blorp: Add support for 2x msaa Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 17:02:29 +03:00
Topi Pohjolainen	125a7fdf32	i965/blorp: Add support for encoding/decoding interleaved 2x msaa Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 17:01:29 +03:00
Samuel Iglesias Gonsálvez	f70cacc4bd	i965: don't lower mod() in glsl ir NIR will lower it in nir_opt_algebraic. No change in shader-db. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 13:44:28 +02:00
Timothy Arceri	72b5d00c9c	glsl: fix cross validation for explicit locations on structs and arrays Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 20:59:57 +10:00
Nicolai Hähnle	39e9cf6cb1	radeonsi: implement TGSI_SEMANTIC_HELPER_INVOCATION Depends on LLVM support introduced in r267102. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 23:14:04 -05:00
Ilia Mirkin	2bac561787	swr: ignore generated files in rasterizer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-04-22 00:07:25 -04:00
Ilia Mirkin	88ca4a43a2	nvc0: fix retrieving query results into buffer for timestamps The timestamps are stored in a funny place, and even though they are a 64-bit result, are not stored with is64bit. Account for that when retrieving the query result into a resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 00:06:49 -04:00
Jason Ekstrand	541e6c0500	i965/surface_state: Use libisl functions for image format lowering This lets us delete some redundant code and keep all of the image_load_store format lowering logic in one place: libisl. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	e53cabe730	i965/fs_surface_builder: Use isl instead of mesa for format info Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	1831fa104c	i965/fs_surface_builder: Add a helper for converting GL to ISL formats Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	24bb75049b	i965/fs_surface_builder: Explicitly handle FORMAT_NONE in num_image_coordinates Previously, we were relying on has_matching_typed_format returning true for MESA_FORMAT_NONE which, in turn, relied on _mesa_get_format_bytes returning 1 for MESA_FORMAT_NONE. When we switch to ISL, this behaviour will no longer be something we can rely on. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	f310c02b94	i965/fs_surface_builder: Take a GL format enum instead of mesa_format Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	2980507a19	isl/format: Add a get_num_channels helper Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	3415cf5f2f	isl/format: Add more isl_format_has_type_channel functions Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	a4c04dd410	isl/format: Break the guts of has_[us]int_channel into a helper Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	ca8c5993bf	anv/image: Use the has_matching_typed_storage_image_format helper from isl Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	65bd8317e2	isl: Add a helper for determining when a typed load/store can be used Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	90576ac963	isl: Take a devinfo in lower_storage_image_format instead of an isl_device We want to call this function from the shader compiler and having a full isl_device available at that point isn't practical. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	37f6f21b1f	isl: Don't use designated initializers in the header C++ doesn't support designated initializers and g++ in particular doesn't handle them when the struct gets complicated, i.e. has a union. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	2785840586	isl: Include c99_compat.h We need the restrict keyword in isl.h Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	ef5dca2034	i965: Add a dependency on libisl To avoid build issues, ensure that you're running `make' at the top level and/or you've executed `make clean' beforehand. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Nicolai Hähnle	fe3b1e1448	radeon: handle query buffer allocation and mapping failures Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94984 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:33:12 -05:00
Nicolai Hähnle	b222580578	radeon: wire end_query return value to sw/hw_end Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:33:07 -05:00
Nicolai Hähnle	71f33a6f69	st/mesa: check return value of begin/end_query They can only indicate out of memory conditions, since the other error conditions are caught earlier. v2: fix error message in EndQuery Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-21 22:33:03 -05:00
Nicolai Hähnle	32214e0c68	gallium: add bool return to pipe_context::end_query Even when begin_query succeeds, there can still be failures in query handling. For example for radeon, additional buffers may have to be allocated when queries span multiple command buffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:32:50 -05:00
Ben Widawsky	6a0d036483	i965: Always use Y-tiled buffers on SKL+ Starting with Skylake, the display engine is capable of scanning out from Y-tiled buffers. As such, we can and should use Y-tiling for better efficiency. This also has the added benefit of being able to fast clear the winsys buffer. Note that the buffer allocation done for mipmaps will already never allocate an X-tiled buffer for GEN9. This has an almost universal positive impact on benchmarks, some improving by as much as 20%. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 20:14:58 -07:00
Marek Olšák	c3b88cc2c1	softpipe: fix a warning due to an incorrect enum comparison no change in behavior, because both are defined the same Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	c9e5a7df61	gallium: remove helpers converting to/from TGSI_PROCESSOR_* Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	af249a7da9	gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_* Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	fb523cb6ad	gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_* Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	ed23335a31	gallium: use enums in p_shader_tokens.h (v2) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1) v2: name enums	2016-04-22 01:30:36 +02:00
Marek Olšák	0135bd44c2	gallium: use enums in p_defines.h (v2) and remove number assignments which are consecutive Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1) v2: name enums	2016-04-22 01:30:34 +02:00
Marek Olšák	8cfc4cf76d	radeonsi: remove the shader parameter from si_set_ring_buffer not used anymore this is a follow-up to the RW buffer cleanup. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-22 01:14:14 +02:00
Marek Olšák	3cbd8cfc7a	radeonsi: decrease GS copy shader user SGPRs to 2 const buffers are no longer used since the clip plane const buffer was moved to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	3acaefb1bb	radeonsi: shorten slot masks to 32 bits Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	0954d5e982	radeonsi: clean up shader resource limit definitions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	3138a28ff2	radeonsi: move default tess level constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	302bec24bd	radeonsi: move sample positions constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	860b658b97	radeonsi: move clip plane constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	698821bda3	radeonsi: rework polygon stippling to use constant buffer instead of texture add it to the RW_BUFFERS descriptor array now the slot masks don't have to have 64 bits Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	bb1e647ada	radeonsi: generalize si_set_constant_buffer this will be used in the next commit Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	36261c29cd	radeonsi: make RW buffer descriptor array global, not per shader stage v2: also simplify invalidation of RW buffer bindings (squashed) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	1378487fb4	radeonsi: rename and rearrange RW buffer slots - use an enum - use a unique slot number regardless of the shader stage (the per-stage slots will go away for RW buffers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Roland Scheidegger	4ff8cbb0d8	gallivm: fix bogus argument order to lp_build_sample_mipmap function Screwed up since `0753b135f6`. (Only an issue with different min/mag filters, and then only in some cases, which is probably why it went unnoticed for quite a while. The effect should have simply been nearest mip filter instead of linear, iff min was nearest, mag was linear, and all pixels hit the magnifying path.) Fixes a bunch of dEQP failures. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-21 23:57:24 +02:00
Kenneth Graunke	73b01e2711	i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+. In commit `cda886a485`, Neil made us stop advertising RGBX formats on Gen9+, as the hardware apparently no longer has working fast clear support for those formats. Instead, we just fall back to RGBA formats, and use SCS to override alpha to 1.0. This is fine, but had one unintended side effect: it made us fall back to slow clears when the color mask disables alpha. Normally, we ignore the color mask for non-existent channels. This includes alpha for XRGB formats as writing garbage to the X channel is harmless. But, now that we use RGBA, we think there's a real alpha channel, and can't do the optimization. To hack around this, check if _BaseFormat is GL_RGB and ignore alpha. Improves WebGL Aquarium performance on Skylake GT3e by about 50% by letting it use repclears instead of slow clears. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-21 12:01:49 -07:00
Iago Toral Quiroga	bdaa0e12a2	i965/blorp: Improve precision of blitting coordinates when clipping We do this in two steps: first we clip the dst rect and adjust the src rect accordingly. Then we do it the other way around. In both passes the adjustment part involves multiplying by a scale factor that can lead to a small precision loss. This is breaking a few dEQP tests. Specifically, the problem happens when we need to clip the same coordinate twice. For example, if srcX0 and dstX0 need both to be clipped we want to avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly but then we realize that the resulting dstX0 still needs to be clipped, so we clip dstX0 and adjust srcX0 again. Each of these two passes can lead to precision loss. What we want to do here is detect the rect that leads to the largest clip (accounting for the scale factor involved), clip that rect and adjust the other one. With this we ensure that the adjusted coordinate does not need to be clipped again and we can skip a second pass, improving precision. Fixes the following 4 dEQP tests: dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-04-21 10:43:39 -07:00
Bas Nieuwenhuizen	38f4cee3ff	radeonsi: Add config parameter to si_shader_apply_scratch_relocs. shader->config is not updated for compute kernels. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-21 19:36:19 +02:00
Matt Turner	1bc983cd64	glsl: Relax GLSL 1.10 float suffix error to a warning. Float suffixes are allowed in all subsequent GLSL specifications, and it's obvious what the user meant if they specify one. Accept it with a warning to avoid breaking applications, like Planeshift (although it looks like between 0.6.1 and 0.6.3 they might have removed the suffixes from their shaders). Reviewed-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:33:08 -07:00
Matt Turner	33565d6764	i965/fs: Readd opt_drop_redundant_mov_to_flags(). This reverts commit `b449366587`. I removed the pass thinking that it was now not useful, but that was not true. I believe I ran shader-db on HSW and saw no results, but HSW does not use the unlit centroid workaround code and as a result does not emit redundant MOV_DISPATCH_TO_FLAGS instructions. On IVB, the shader-db results are: total instructions in shared programs: 6650806 -> 6646303 (-0.07%) instructions in affected programs: 106893 -> 102390 (-4.21%) helped: 793 total cycles in shared programs: 56195538 -> 56103720 (-0.16%) cycles in affected programs: 873048 -> 781230 (-10.52%) helped: 553 HURT: 209 On SNB, the shader-db results are: total instructions in shared programs: 7173074 -> 7168541 (-0.06%) instructions in affected programs: 119757 -> 115224 (-3.79%) helped: 799 total cycles in shared programs: 98128032 -> 98072938 (-0.06%) cycles in affected programs: 1437104 -> 1382010 (-3.83%) helped: 454 HURT: 237 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-21 10:32:40 -07:00
Topi Pohjolainen	0020ca3c92	i965/blorp: Do not emit pma stall on gen9+ This was left out from the original gen8 upload introduction. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 20:18:51 +03:00
Tim Rowley	81c1c481ed	swr: add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT to get_param Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-21 11:32:09 -05:00
Emil Velikov	9dcb3dfb23	i965: automake: remove gratuitous "+" during variable assignment There is no initial assignment, thus appending to it does not work. Fixes: `b27c85c4c0` "i965: add build rule for brw_nir_trig_workarounds.c" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 16:48:34 +01:00
Rob Herring	1ba203a085	gbm: add GBM_FORMAT_XBGR8888 format support Add GBM_FORMAT_XBGR8888/__DRI_IMAGE_FORMAT_XBGR8888 format support which is needed for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:56 +01:00
Rob Herring	ccdcf91104	st/dri: add 32-bit RGBX/RGBA formats Add support for 32-bit RGBX/RGBA formats which are preferred for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:53 +01:00
Rob Herring	3b69076435	dri/common: add MESA_FORMAT_R8G8B8{A8, X8}_UNORM formats as supported configs Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as these are the preferred formats for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:21 +01:00
Rob Herring	b27c85c4c0	i965: add build rule for brw_nir_trig_workarounds.c on Android Commit `bfd17c76c1` ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a generated file brw_nir_trig_workarounds.c which broke the Android build. Add the necessary makefiles to the Android build. Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:43:26 +01:00
Rob Herring	30239ba056	glsl: android: add back missing generated glcpp include path Commit `4db8f15a25` ("glsl: move the android build scripts a level up") dropped a generated include path for glcpp. Add it back adjusting for the new location. Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:43:21 +01:00
Jonathan Gray	28e3ae344b	loader: add a libdrm case for loader_get_device_name_for_fd Use dev_node_from_fd() with HAVE_LIBDRM to provide an implementation of loader_get_device_name_for_fd() for non-linux systems that use libdrm but don't have udev or sysfs. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:41:41 +01:00
Jonathan Gray	5d09394fb1	i965/tiled_memcpy: don't unconditionally use __builtin_bswap32 Use the defines Mesa configure sets to indicate presence of the bswap32 builtins. This lets i965 work on OpenBSD again after the changes that were made in `0a5d8d9af4`. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-21 14:41:41 +01:00
Jonathan Gray	9bbf3737f9	egl/x11: authenticate before doing chipset id ioctls For systems without udev or sysfs that use drm ioctls in the loader drm authentication must take place earlier or the loader will fail "MESA-LOADER: failed to get param for i915". Patch from Mark Kettenis. Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mark Kettenis <kettenis@openbsd.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> [Emil Velikov: remove gratuitous white-space] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:40:44 +01:00
Bas Nieuwenhuizen	4abe051a3f	gallium/radeon: Silence possibly uninitialized variable warning. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 13:40:47 +02:00
Bas Nieuwenhuizen	51d1551241	winsys/amdgpu: Silence possibly uninitialized variable warning. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 13:40:42 +02:00
Bas Nieuwenhuizen	4d13c7c879	radeonsi: Enable loading into CE RAM. We need to enable a bit in the CONTEXT_CONTROL packet for the loads to work. v2: Style issues. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-21 12:50:58 +02:00
Bas Nieuwenhuizen	f45f54e14a	radeonsi: Use defines for CONTEXT_CONTROL instead of magic values. v2: Use field names provided by Nicolai. v3: Updated to use CONTEXT_CONTROL prefix. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 12:50:58 +02:00
Thomas Hindoe Paaboel Andersen	d4a21a0de0	winsys/amdgpu: fix preamble IB size The missing break caused the IB size to be overwritten with the size of IB_CONST. This was introduced in: `7201230582` Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 12:14:50 +02:00
Topi Pohjolainen	935ce14a44	i965/blorp: Reduce the urb size requirement for vertex buffer Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	26fdb7e51e	i965/blorp: Reduce the size of vertex buffer Previously the vertex buffer consisted of eight floats per vertex of which six were constants. These can be as easily provided by vertex fetcher as it is capable of filling vertex elements with constant one and zero. This reduces the size of the vertex buffer from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	0ae360f098	i965/blorp: Do not trigger urb re-configuration unnecessarily Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	69dfb7b2b7	i965/blorp: Skip re-emitting urb config whenever possible Otherwise clearing with blorp will regress performance in some synthetic test cases. v2: Used vsize >= 2 instead of vsize > 0, and updated the comment. Review by Ken in one of the earlier patches revealed this. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	7644e8ab68	i965/blorp: Prepare to switch from compute pipeline Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	aa322f8ae5	i965/blorp: Skip uploading state/options not needed for clears In case there is no source it means the program does a simple clear or a resolve. In such case there is no need to program sampling state or enable pixel kill in fragment shader. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	87d333f2fe	i965/blorp: Re-introduce clear programs This partially reverts `2f28a0dc23` Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	69c364f2dc	i965/meta: Move check for srgb into is_color_fast_clear_compatible() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	8a696e75d8	i965/meta: Expose check for fast clear compatibility Also add the additional render format check to the same utility. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	a848ad6806	i965/meta: Expose fast clear value setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	fb14a2fc78	i965/meta: Expose non-fast clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	9d79235e4e	i965/meta: Expose resolve clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	2757d723da	i965/meta: Expose fast clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	3ef957e783	i965: Declare input to mcs alignment calculation constant Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	c40b1efa70	i965/blorp: Switch the order of render and texture targets On gen8 color resolving won't work anymore if the target isn't the first entry in the binding table. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	0d062d79c3	i965/blorp: Reduce scope for generator and its inputs Generator is only needed for getting the assembly. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	4c3de6b2d6	i965/blorp: Add support for disabling color blending Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	da5a477ce4	i965/blorp: Add support for setting fast clear operation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	7de72f728b	i965/blorp: Enable blits on gen8 v2 (Ken): Moved switch cases for gen8/9 in texel_fetch() to earlier patch adding gen8/9 sampling support. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	f7ab4e0cc4	i965/blorp: Prepare stencil sampling for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	708453952b	i965/blorp: Add check for supported sample numbers v2 (Ken): Fix the condition on using meta for stencil blits: use_blorp -> !use_blorp Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	9e4d19372b	i965/blorp: Add support for sampling 3D textures This patch adds additional MOV instruction for all blorp programs that use SHADER_OPCODE_TXF. Alternative is to augment blorp program key to tell if z-coordinate is needed, add condition to the blorp blit compiler and to produce a variant with and without the MOV. This seems a little overkill. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	6b33d63d77	i965/blorp: Add support for source swizzle In order to support cases where gen9 uses RGBA format to back client requested RGB, one needs to have means to force alpha channel to one when user requested RGB surface is used as blit source. v2 (Ken): Use helper for constructing the swizzle (this should be changed to use brw_get_texture_swizzle() as a follow-up). Also calculate the swizzle for CopyTexSubImage. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	52e7008a5a	i965/blorp: Pipeline upload support for gen8 v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state Add emission of pma stall. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	2fda441371	i965/gen8: Expose pma stall emission Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:19:30 +03:00
Topi Pohjolainen	8b2332e3d1	i965: Allow texture surface state setup to be used by blorp Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:42:10 +03:00
Topi Pohjolainen	0ad83d222b	i965/blorp: Prepare sampling for gen9 v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These were wrongly introduced in blit-enabling patch. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:41:40 +03:00
Topi Pohjolainen	328ab6c268	i965/blorp: Prepare render target write for gen8 v2 (Ken): Use payload directly instead of retyping it into vec8. Drop the implied header, it isn't used for gen6+ anyway. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:40:33 +03:00
Topi Pohjolainen	135f00e666	i965/blorp/gen6: Prepare vertex buffer setup logic for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:37:06 +03:00
Topi Pohjolainen	395abb9c3b	i965/blorp/gen7: Expose state setup applicable to gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:53 +03:00
Topi Pohjolainen	ede09e672a	i965/blorp: Use 8k chunk size for urb allocation Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks), which meant VS URB data would start at an offset of 16kB. However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the push constant region. This means that the PS push constant and VS URB data regions overlap, which can lead to corruption. v2 (Ken): Better description of the change, and do not change vs_size from 2 to 1. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:26 +03:00
Topi Pohjolainen	e04b3cdf33	i965/blorp/gen7: Prepare re-using for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:14 +03:00
Topi Pohjolainen	f1ddfa8512	i965/blorp: Let compiler calculate the vertex buffer size Currently the size is sizeof(float) times too large. One reserves GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands for the size of vertex buffer in bytes. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:58 +03:00
Topi Pohjolainen	4c526370ca	i965/gen8: Expose state base address setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:45 +03:00
Topi Pohjolainen	9949103756	i965/gen8: Expose surface state helpers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:34 +03:00
Topi Pohjolainen	4f1d9f2879	i965/gen9: Use correct size for DS_STATE Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:32:12 +03:00
Roland Scheidegger	0295db2a8b	glsl: add forgotten textureOffset function for sampler2DArrayShadow This was part of EXT_gpu_shader4 - as such it should have been supported by glsl 130. It was however forgotten, and not added until glsl 430 - with the wrong syntax no less (glsl 430 mentions it was overlooked). glsl 440 (but revision 8 only) fixed this finally for good. At least nvidia supports this with just glsl version 1.30 as well (the spec doesn't explicitly say it should be supported retroactively), so just add this to the other glsl 130 textureOffset functions. Passes a (hacked) piglit tex-miplevel-selection test (2DArrayShadow textureOffset -auto) with llvmpipe. v2: fix up comment (by Ian), add testing to commit message. Reviewed-by: Dave Airlie <airlied@gmail.com>	2016-04-21 02:38:46 +02:00
Kenneth Graunke	d8c8f4203f	i965: Fix interpolateAtSample() on single sampled buffers. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	447d3eec6a	i965: Fix gl_SampleMaskIn[] in per-sample shading mode. The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation. Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8} sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	66a725570c	i965: Only enable oMask output when there's a multisample FBO. The ARB_sample_shading specification says that setting gl_SampleMask bits to 0 means that the corresponding sample "should be considered uncovered for the purposes of multisample fragment operations (Section 4.1.3)." The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment Operations") specifies: "No changes to the fragment alpha or coverage values are made at this step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS is not one." oMask output alters coverage masks and can kill pixels. We need to disable it in the above case, which conveniently corresponds to key->multisample_fbo being false. Khronos bug #12188 also spells this out clearly: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188 Fixes two Piglit tests: tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0 tests/spec/arb_sample_shading/builtin-gl-sample-mask 0 Fixes 21 ES3 conformance tests: ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0 
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7 Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask.discard_half_per_pixel.default_framebuffer sample_mask.discard_half_per_pixel.singlesample_rbo sample_mask.discard_half_per_pixel.singlesample_texture sample_mask.discard_half_per_sample.default_framebuffer sample_mask.discard_half_per_sample.singlesample_rbo sample_mask.discard_half_per_sample.singlesample_texture sample_mask.discard_half_per_two_samples.default_framebuffer sample_mask.discard_half_per_two_samples.singlesample_rbo sample_mask.discard_half_per_two_samples.singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	81407531e0	i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo. I'm going to need a key entry meaning "we have a multisample FBO, and multisampling is enabled" in an upcoming patch. This is basically wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID system value is read. The only use of wm_key->compute_sample_id is in emit_sampleid_setup(), which is only called when handling the SAMPLE_ID system value. So we can just eliminate the check and generalize the field. v2: Also update the Vulkan driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	de0a46a040	i965: Delete now dead persample_2x FS program key flag. This was only used by the old gl_SampleID calculations. The new code doesn't need to handle 2x specially. v2: Delete it from the Vulkan driver, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	57118a19da	i965: Simplify gl_SampleID setup on Gen8+. On Gen7+, the thread payload provides the sample ID - we can read it in two instructions, without any elaborate calculations. We don't even need a state dependency - this will properly produce zero in the non-MSAA case. Unfortunately, we need the state flag anyway, so we may as well continue to use it to produce a single MOV 0 instead of SHR/AND. For some reason, the sample ID field is always zero on Gen7/7.5, so we can't use this yet. However, it works fine on Gen8+. So, land the code and use it where it's working, and leave a TODO for later. v2: Fix register types in the comment (caught by Matt Turner!). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	528255b0b1	i965: Flip key->compute_sample_id check. This just moves the simple case first. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Bas Nieuwenhuizen	43ed1f73f8	st/mesa: Use correct size for compute CAPs. Some CAPs are stored as 64-bit value while Mesa stores the related constant as 32-bit value. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-21 00:27:01 +02:00
Kenneth Graunke	60a17d0718	i965: Properly handle integer types in opt_vector_float(). Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination. This meant that we could mess up sequences of integer loads, such as: mov vgrf6.0.x:D, 0D mov vgrf6.0.y:D, 1D mov vgrf6.0.z:D, 2D mov vgrf6.0.w:D, 3D Here, integer 0/1/2/3 become approximately 0.0f, so we generated: mov vgrf6.0:F, [0F, 0F, 0F, 0F] which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV: mov vgrf6.0:D, [0F, 1F, 2F, 3F] To do this, first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV. Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug. Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell: total instructions in shared programs: 7084195 -> 7082191 (-0.03%) instructions in affected programs: 246027 -> 244023 (-0.81%) helped: 1937 total cycles in shared programs: 65669642 -> 65651968 (-0.03%) cycles in affected programs: 531064 -> 513390 (-3.33%) helped: 1177 v2: Handle the type of zero better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	1aa28f3509	i965: Make opt_vector_float() only handle non-type-conversion MOVs. We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer). As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	2a25a5142b	i965: Fold vectorize_mov() back into the one caller. After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	9967561158	i965: Rework opt_vector_float() control flow. This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF. v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Jason Ekstrand	50018522d2	anv: s/anv_batch_emit_blk/anv_batch_emit/ Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	0a45395902	anv: Remove the old emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	86c52bc757	anv/gen7_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	744e133431	anv/gen7_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	cae2f14947	anv/device: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	932c353592	anv/state: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	9e9f3f4e71	anv/gen8_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	dba3727bea	anv/genX_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a48f8340d9	anv/gen8_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	8a6ced83e9	anv/cmd_buffer: Use the new emit macro for queries Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	db25e1eec5	anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	deb13870d8	anv/cmd_buffer: Use the new emit macro for compute shader dispatch Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	06fc7fa684	anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a71ded0e18	anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	56453eeaff	anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESS Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	1d4d6852b4	anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	64ad2d3bcd	anv: Add a new block-based batch emit macro This new macro uses a for loop to create an actual code block in which to place the macro setup code. One advantage of this is that you syntactically use braces instead of parentheses. Another is that the code in the block doesn't even get executed if anv_batch_emit_dwords fails. Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Samuel Pitoiset	d30768025a	gk110/ir: make use of IMUL32I for all immediates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:36 +02:00
Samuel Pitoiset	17a37c78fc	gk110/ir: do not overwrite def value with zero for EXCH ops This is only valid for other atomic operations (including CAS). This fixes an invalid opcode error from dmesg. While we are at it, make sure to initialize global addr to 0 for other atomic operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:33 +02:00
Marcin Ślusarz	3caf2e89aa	anv: fix build without Wayland platform Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 11:12:10 -07:00
Laurent Carlier	6c952d8ac7	anv: fix building on i686 with -mcpu=generic mcpu=generic doesn't enable sse2, and anvil definitely needs it Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:48:11 -07:00
Jason Ekstrand	2ef7aef322	spirv: Trivially handle the NonWriteable decoration Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:33:23 -07:00
Connor Abbott	b6dc940ec2	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 09:47:05 -07:00
Samuel Pitoiset	7143068296	nvc0: avoid tex read fault from compute shaders on GK110 After some investigation, it seems that disabling the UNK02C4 command avoids a read fault with texelFetch() from a compute shader. I have no clue what this method actually does, but this keeps the GPU from hanging with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-20 18:28:47 +02:00
Jason Ekstrand	87a4fb516e	i965/vec4: Always split uniforms in array_access_to_pull_constants Normally, we split uniforms at the end but in Vulkan, we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it. I really don't like this patch not because it doesn't work (it does) but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	b3f43822c7	i965/vec4: Use the correct offset for the swizzle shift in push constants This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	9f16e170fe	i965/vec4: Use nir_intrinsic_base in the load_uniform implementation We shouldn't be reading the const_index directly Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	f63a95080f	anv/apply_dynamic_offsets: Provide a range on the load_uniform Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:14:58 -07:00
Jason Ekstrand	35b758c378	anv/lower_push_constants: Stop treating scalar specially All of the code that did something special based on vec4 vs. scalar is bogus. In the backend, everything is now in units of bytes and the vec4 backend can handle full std140 packing so we don't need to do anything special anymore. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998	2016-04-20 09:14:47 -07:00
Tim Rowley	3bbe8a09ea	swr: fix resource backed constant buffers Code was using an incorrect address for the base pointer. v2: use swr_resource_data() utility function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Tested-by: Markus Wick <markus@selfnet.de>	2016-04-20 09:57:55 -05:00
Hans de Goede	2ac2ecdd6c	nouveau: codegen: Add support for OpenCL global memory buffers Add support for OpenCL global memory buffers, note this has only been tested with regular load and stores and likely needs more work for e.g. atomic ops. Tested with piglit on a gf119 and a gk107: ./piglit run -o shader -t '.arb_shader_storage_buffer_object.' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.arb_compute_shader.' results/shader [20/20] skip: 4, pass: 16 \| Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-20 13:46:03 +02:00
Hans de Goede	61d52a5fb9	nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commit changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL register file. Tested with piglit on a gf119 and a gk107: ./piglit run -o shader -t '.arb_shader_storage_buffer_object.' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.arb_compute_shader.' results/shader [20/20] skip: 4, pass: 16 \| Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-20 13:46:03 +02:00
Jose Fonseca	f02f4d09ce	scons: Build dri_common_interop.c.	2016-04-20 12:41:24 +01:00
Marek Olšák	4fa3d35cc5	st/dri: implement the GL interop DRI extension (v2.2) v2: - set interop_version - simplify the offset_after macro v2.1: - use version numbers, remove offset_after - set "out_driver_data_written" v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too - add whandle.offset to buf_offset - disable the minmax cache for GL_TEXTURE_BUFFER	2016-04-20 12:18:47 +02:00
Marek Olšák	37d3a26bd6	glx: implement GLX part of interop interface (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	b6eda70843	egl: implement EGL part of interop interface (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	5e9ed261ed	dri_interface: add interface for GL interop with other APIs (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	6eeb729490	include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v4.2) v2: - use "enum" to define stuff v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED v4: - add mesa_glinterop_device_info::interop_version - more comments - remove #define MESA_GLINTEROP_VERSION - use const for "in" v4.1: - use version numbers for structures - add "out_driver_data_written" v4.2: - buf_offset & buf_size affect GL_ARRAY_BUFFER too, this is required for sharing suballocations within a larger buffer	2016-04-20 12:15:41 +02:00
Nicolas Dufresne	8093990ef4	st/dri: Fix RGB565 EGLImage creation When creating egl images we do a bytes to pixel conversion by dividing by 4 regardless of the pixel format. This does not work for RGB565. In this patch, we avoid useless conversion and use proper API when the conversion cannot be avoided. Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-20 17:55:30 +09:00
Nicolas Dufresne	4463f38766	st/dri: Factor out DRI2 to PIPE_FORMAT conversion This code is already duplicated twice and will be useful again. This will also help when adding formats. Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-20 17:34:03 +09:00
Rob Clark	899bd63ace	freedreno/a4xx: lower srgb in shader for astc textures This seems like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this by doing the srgb->linear conversion in the shader instead. This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.astcsrgb* (The remaining fails seem to be a bug w/ ASTC + linear filtering, also possibly a420.0 specific.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 17:14:04 -04:00
Rob Clark	eddfc97709	nir/lower-tex: add srgb->linear lowering Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-19 17:13:50 -04:00
Rob Clark	eb00a0fc58	nir/builder: const'ify swiz param No need for it not to be const, and lets caller declare it const if desired. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-19 17:13:36 -04:00
Rob Clark	52ccc6349f	nir/lower-tex: make options a local var Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:12:49 -04:00
Rob Clark	d4ff42bd0a	freedreno: cleanup fd_set_sampler_views The separate FS/VS entrypoints are no longer used since `a3ed98f`. So just inline them. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:11:47 -04:00
Russell King	fadfaa82c6	tgsi/lowering: improved lowering for LRP Provide an improved lowering for LRP, which can be implemented in two MAD instructions with a bit of rearranging of the equation, rather than the literal implementation of two multiplies, an add and a subtract. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	67da7dd98a	tgsi/lowering: improved lowering for XPD Improve XPD lowering to consume less instructions by using the MAD instruction to perform the multiply and subtraction together. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	65460cf4c8	tgsi/lowering: add support for lowering TRUNC Add support for lowering TRUNC using the following sequence: FRC tmpA, \|src\| SUB tmpA, \|src\|, tmpA CMP dst, -tmpA, tmpA Note that this is incompatible with FRC lowering. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	23e870a888	tgsi/lowering: add support for lowering FLR and CEIL Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD instructions for GPUs that support FRC but not FLR or CEIL. Since these use FRC, it is invalid to ask for FLR or CEIL to be lowered along with FRC, so add an assert to catch this invalid configuration. We also need to deal with FLR instructions emitted by the lowering code. Fix these up with the FRC+SUB equivalent when FLR lowering is enabled. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Bas Nieuwenhuizen	464cef5b06	radeonsi: enable TGSI support cap for compute shaders v2: Use chip_class instead of family. v3: Check kernel version for SI. v4: Preemptively allow amdgpu winsys for SI. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	1f32d5d59f	radeonsi: Consider input SGPR count for compute shader SGPR count. si_shader_create corrects the SGPR count with si_fix_num_sgprs. We then recompute the rsrc1 register to use the new SGPR count. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	6c833ba1ab	radeonsi: Add CE synchronization for compute dispatches. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	e0b729c544	mesa/st: enable compute shaders if images are also supported v2: Also depend on atomic counters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen	41d79bcbfa	radeonsi: clean up compute flush Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen	7a92c08428	radeonsi: do not do two full flushes on every compute dispatch v2: Add more CS_PARTIAL_FLUSH events. Essentially every place with waits on finishing for pixel shaders also has a write after read hazard with compute shaders. Invalidating L2 waits implicitly on pixel and compute shaders, so, we don't need a CS_PARTIAL_FLUSH for switching FBO. v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2. According to Marek the INV_GLOBAL_L2 events don't wait for compute shaders to finish, so wait for them explicitly. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	e764ee13ae	radeonsi: split setting graphics and compute descriptors Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	061ce9399a	radeonsi: split texture decompression for compute shaders Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	e56514f631	radeonsi: update predicate condition for compute dispatches Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	c3083d841e	radeonsi: implement TGSI compute dispatch v2: - Use radeon_set_sh_reg_seq. - Set predicate bit for conditional rendering. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	1349dd16ff	radeonsi: only emit compute shader state when switching shaders v2: - Do check if anything changed earlier - Use emitted_program instead of emitted_bo to prevent shaders with shader->bo = NULL confusing the check - Use radeon_set_sh_reg* Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	ba1f66a73d	radeonsi: rework compute scratch buffer Instead of having a scratch buffer per program, have one per context. Also removed the per kernel wave count calculations, but that only helped if the total number of waves in the dispatch was smaller than sctx->scratch_waves. v2: Fix style issue. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	107f4d3538	radeonsi: do per cs setup for compute shaders once per cs Also removes PKT3_CONTEXT_CONTROL as that is already being done by si_begin_new_cs, when emitting init_config. v2: - Use radeon_set_sh_reg_seq. - Also set COMPUTE_STATIC_THREAD_MGMT_SE2 / SE3 for CIK+ Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	52d3584dec	radeonsi: don't pass scratch buffer to user SGPRs As far as I can see we use relocations for clover too. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	422a19f76f	radeonsi: split input upload off from si_launch_grid Also uses a dynamically allocated buffer using u_upload_alloc. The old buffer per program approach required serializing all dispatches of the same program. v2: - Clarified commit message. - Use radeon_set_sh_reg_seq. - Also upload input buffer for clover kernels, even when input_size is 0, as it contains grid parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	898298efc9	radeonsi: implement TGSI compute shader creation v2: Moved scratch_enabled initialization after compile. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	85fd7817ee	radeonsi: update shader count for compute shaders Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	da88c2a8e8	radeonsi: set maximum work group size based on block size Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	b082147b78	radeonsi: implement shared atomics v2: - Use single region - Use get_memory_ptr Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	8acf3e501b	radeonsi: implement shared memory load/store v2: - Use single region - Combine address calculation Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	84a6761ae3	radeonsi: add shared memory Declares the shared memory as a global variable so that LLVM is aware of it and it does not conflict with passes like AMDGPUPromoteAlloca. v2: - Use ctx->i8. - Dropped null-check for declare_memory_region. - Changed memory region array to single region. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	753a3e472b	radeonsi: lower compute shader arguments Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	008d977d01	radeonsi: Use CE for all descriptors. v2: Load previous list for new CS instead of re-emitting all descriptors. v3: Do radeon_add_to_buffer_list in si_ce_upload. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	0b6c463dac	gallium/util: Add u_bit_scan_consecutive_range64. For use by radeonsi. v2: Make sure that it works for all 64 bits set. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	058b54c624	radeonsi: Replace list_dirty with a mask. We can then upload only the dirty ones with the constant engine. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	aabc7d61d6	radeonsi: Add CE uploader. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	0d7ddd6819	radeonsi: Allocate chunks of CE ram. v2: Use 32 byte alignment. v3: Don't allocate CE space for vertex buffer descriptors. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	86c71ff989	radeonsi: Add CE synchronization. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	fe1ef23b66	radeonsi: Add CE packet definitions. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	8fee75d606	radeonsi: Create CE IB. Based on work by Marek Olšák. v2: Add preamble IB. Leaves the load packet in the space calculation as the radeon winsys might not be able to support a preamble. The added space calculation may look expensive, but is converted to a constant with (at least) -O2 and -O3. v3: - Fix code style. - Remove needed space for vertex buffer descriptors. - Fail when the preamble cannot be created. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	7201230582	winsys/amdgpu: Enlarge const IB size. Necessary to prevent performance regressions due to extra flushing. Probably should enlarge it even further when also updating uniforms through the CE, but this seems large enough for now. v2: Add preamble IB. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Marek Olšák	7997b5f005	winsys/amdgpu: Add support for const IB. v2: Use the correct IB to update request (Bas Nieuwenhuizen) v3: Add preamble IB. (Bas Nieuwenhuizen) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Marek Olšák	e78170f388	winsys/amdgpu: split IB data into a new structure in preparation for CE Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Marek Olšák	f4b77c764a	gallium/radeon: move ring_type into winsyses Not used by drivers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Jose Fonseca	1d2ac7a7ca	llvmpipe: Call LLVMShutdown before exiting. So that LLVM frees its globals. Trivial.	2016-04-19 12:10:09 +01:00
Jose Fonseca	524042fa35	llvmpipe: Avoid LLVMGetGlobalContext in tests. Trivial.	2016-04-19 12:10:02 +01:00
Jose Fonseca	bb9e8c5090	llvmpipe: Skip false exp2 failure in lp_test_arit due to buggy MSVCRT. 64bits MSVCRT's exp2f(-inf) returns -inf instead of 0. Tested with MSVC 2013's CRT. (I haven't tried 2015 yet.) Also this does not happen with MinGW. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:53 +01:00
Jose Fonseca	ee9876be1d	llvmpipe: Test more vector lengths. All powers of two up to the native vector length. There is actually a bug in lp_build_round for v2, whereby it doesn't round to nearest. Fixing is left to the future, but the test is now able to expect it to fail. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:44 +01:00
Jose Fonseca	932b71f17d	gallivm: Avoid llvm::sys::getProcessTriple(). Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3 onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:37 +01:00
Jose Fonseca	b5ca689cee	gallivm: Remove lp_get_module_id. Just keep a copy of the module_name in gallivm. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:26 +01:00
Jose Fonseca	969ba8bfa7	gallivm: Fix MCJIT with LLVM 3.3. One needs to call setJITMemoryManager for LLVM 3.3, instead of setMCJITMemoryManager. This regressed in commits 065256df/75ad4fe7 when trying to make the code to build with LLVM 3.6. Tested MCJIT with LLVM 3.3 to 3.6. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:17 +01:00
Jose Fonseca	cf4105740f	gallivm: Make MCJIT a runtime option. On the LLVM versions that support it, so we can easily switch between MCJIT/old-jit for testing. The new option is GALLIVM_MCJIT. Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes segfault, both on Linux and Windows. I'm almost certain this used to work, so there probably is a regression somewhere. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:14 +01:00
Jose Fonseca	7d2151b6ea	scons: Show the unit test full path. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:11 +01:00
Jose Fonseca	2211f8d559	gallivm: Use LLVMSetTarget. Instead of LLVM C++ interfaces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:00 +01:00
Jose Fonseca	9aa23b11e4	gallivm: Use LLVMPrintValueToString where available. And llvm::raw_string_ostream where not (LLVM 3.3). Thereby eliminating yet another dependency on unstable LLVM interfaces. As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which was disabled, probably due to C++ issues.) Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:28:37 +01:00
Jose Fonseca	f6621cd3be	gallium/tests: Update UTIL_FORMAT_MAX_* defines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:28:16 +01:00
Jose Fonseca	121a0cedc8	Revert "nv50/ra: `isinf()` is in namespace `std` since C++11." This reverts commit `f525db6358`. It was superseded by commit `649704f1f7`.	2016-04-19 11:22:45 +01:00
Eric Anholt	802b9292aa	vc4: Fix fbo-generatemipmap-formats for NPOT. Single-sampled texture miplevels > 1 are stored in POT-aligned areas, but we only get one value to control the stride of the src and dst for single sampled buffers. A RCL tile blit from level != 1 to level == 0 would therefore load from the wrong stride.	2016-04-18 16:55:36 -07:00
Eric Anholt	2402bb6095	vc4: Remove unused "immediates" field This was for TGSI, which we no longer have to deal with.	2016-04-18 16:48:45 -07:00
Ben Widawsky	2408899cb2	i965: Define miptree map functions static (trivial) They were already declared as such. It was changed here: commit `31f0967fb5` Author: Ian Romanick <ian.d.romanick@intel.com> Date: Wed Sep 2 14:43:18 2015 -0700 i965: Make intel_miptree_map_raw static Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-18 16:12:13 -07:00
Matt Turner	b1d9353cb5	glsl: Properly handle ldexp(0.0f, non-zero-exp).	2016-04-18 15:48:54 -07:00
Dave Airlie	3a26ef23e7	gallivm: convert size query to using a set of parameters. This isn't currently that easy to expand, so fix it up before expanding it later to include dynamic samplers. [airlied: use some local variables (Roland)] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-19 07:33:39 +10:00
Tim Rowley	3227c10270	swr: dereference cbuf/zbuf/views on context destroy Fixes resource memory leaks. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-18 15:52:26 -05:00
Rob Clark	77a9107bf2	freedreno/ir3: fix grouping issue w/ reverse swizzles When we have something like: MOV OUT[n], IN[m].wzyx the existing grouping code was missing a potential conflict. Due to input needing to be sequential scalar regs, we have: IN: x <-> y <-> z <-> w which would be grouped to: OUT: w <-> z2 <-> y2 <-> x (where the 2 denotes a copy/mov) but that can't actually work. We need to realize that x and w are already in the same chain, not just that they aren't both already in new chain being built. With this fixed, we probably no longer need the hack from `f68f6c0`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-18 15:41:32 -04:00
Marek Olšák	ed66c75784	radeonsi: use enums in si_shader.h Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	0c52caf7b7	gallium/radeon: use enums in r600_query.h Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	dd9ca77cb9	radeonsi: always use PFP_SYNC_ME when doing flushes and waits This is typically used by the closed driver before SURFACE_SYNC. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	1db5678688	radeonsi: don't do VS/PS partial flushes if SURFACE_SYNC waits too Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	58494b42b5	radeonsi: add safety assertions for meta cache flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	78f58a4e6f	radeonsi: don't use ACQUIRE_MEM on the graphics ring It's only required on the compute ring. This matches the closed driver. The compute flag is removed to prevent confusion and Bas's compute shader patches remove it in the whole function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	3faecdd4e1	radeonsi: remove TODO and correct a comment in si_emit_cache_flush Yes, that flag is really needed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	28c2573b4f	radeonsi: don't flush CB/DB caches for performance counters I'm not sure about this. This will make the engines go idle, but the caches will be unflushed. This should match app behavior without performance counters, which can be a good thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	97c328b2a3	gallium/radeon: don't flush CB/DB caches for timestamp queries Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	6dc21b1962	gallium/util: fix undefined shift to the last bit in u_bit_scan Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	9434aa8103	gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff The second ffs returns 0, yielding count == -1. v2: change 1 to 1u Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-18 19:51:24 +02:00
Marek Olšák	e50e1f86b0	gallium/radeon: fix Nine with its slightly shifted viewports just need to do the calculation in floating-point and then round things properly Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-04-18 19:51:24 +02:00
Erik Faye-Lund	ee5b35142a	docs: correct name for GL_OES_primitive_bounding_box When this extension was added, an underscore was mistakenly replaced by a space. Let's correct this, so it's a tad easier to grep for this extension. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>	2016-04-18 10:48:57 -07:00
Kenneth Graunke	c092f9b96a	meta: Don't botch color masks when changing drawbuffers. Color clears should respect each drawbuffer's color mask state. Previously, we tried to leave the color mask untouched. However, _mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the color drawbuffers in a different order, so we ended up pairing drawbuffers with the wrong color mask state. The new _mesa_meta_drawbuffers_and_colormask() function does the same job as the old _mesa_meta_drawbuffers_from_bitfield(), but also rearranges the color mask state to match the new drawbuffer configuration. This code was largely ripped off from Gallium's st_Clear code. This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds up to 8 drawbuffers, sets color masks for each, and then calls glClearBufferfv to clear each buffer individually. ClearBuffer causes us to rebind only one drawbuffer, at which point we used ctx->Color.ColorMask[0] (draw buffer 0's state) for everything. We could probably delete _mesa_meta_drawbuffers_from_bitfield(), but I'd rather not think about the i965 fast clear code. Topi is rewriting a bunch of that soon anyway, so let's delete it then. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-18 10:39:31 -07:00
Kenneth Graunke	a33f94ba8c	meta: Don't smash ColorMask when using MESA_META_COLOR_MASK save bit. This allows meta operations to inspect the existing color mask, and then do their own smashing. BlitFramebuffer and Clear already override the color mask, so this was also redundant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-18 10:39:26 -07:00
Eric Anholt	48fe53bbb9	vc4: Add support for rendering to cube map surfaces. We need to fix up the offset to point at the face of the cube. Fixes piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-18 10:10:44 -07:00
Eric Anholt	21a9ed6207	vc4: Don't flush on read-only access of buffers read by the CL. Fixes piglit mixed-immediate-and-vbo, and may significantly improve performance of applications that store a 4-byte IB in the same VBO as vertex data.	2016-04-18 10:10:44 -07:00
Eric Anholt	9e8a8b0c8b	vc4: Sanity check that flushes don't happen between state emit and draw. Catches the cause of failure in arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of failure before, and it probably won't be the last time.	2016-04-18 10:10:44 -07:00
Eric Anholt	56b14adf85	vc4: Sanity check strides for imported BOs. If we're going to sample from or render to them at some particular size, we'd better make sure that they actually are that size. Causes some tests under simulation to generate appropriate error messages instead of failures.	2016-04-18 10:10:44 -07:00
Pierre Moreau	649704f1f7	math: Import isinf and others to global namespace Starting from C++11, several math functions, like isinf, moved into the std namespace. Since cmath undefines those functions before redefining them inside the namespace, and glibc 2.23 defines the C variants as macros, the C variants in global namespace are not accessible any longer. v2: Move the fix outside of Nouveau, as suggested by Jose Fonseca, since anyone might need it when GCC switches to C++14 by default with GCC 6.0. v3: * Put the code directly inside c99_math.h rather than creating a new header file, as asked by Jose Fonseca; * Guard the code behind glibc version checks, as only glibc >= 2.23 defines isinf & co. as functions, as suggested by Jose Fonseca. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-18 11:10:25 +01:00
Oded Gabbay	d3c98c73dc	r600g: Move R600_BIG_ENDIAN to r600_pipe_common.h I need to do this so I could use R600_BIG_ENDIAN in files which include r600_pipe_common.h but not r600_pipe.h Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-18 09:50:08 +03:00
Oded Gabbay	72d0d2ba59	r600g: fix code indentation Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-18 09:50:08 +03:00
Emil Velikov	a998e49259	docs: add news item and link release notes for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-17 23:32:41 +01:00
Emil Velikov	50eeb5fb16	docs: add sha256 checksums for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `596c6504b3`)	2016-04-17 23:32:41 +01:00
Emil Velikov	c1bf47ada2	docs: add release notes for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ca2fbf6f8f`)	2016-04-17 23:32:41 +01:00
Roland Scheidegger	d11111a551	gallivm: don't use vector selects with llvm 3.7 llvm 3.7 sometimes simply miscompiles vector selects. See https://bugs.freedesktop.org/show_bug.cgi?id=94972 This was fixed in llvm r249669 (https://llvm.org/bugs/show_bug.cgi?id=24532). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-18 00:23:34 +02:00
Dave Airlie	b3616f1326	nir: only dereference undef after NULL check. (v2) Pointed out by coverity. v2: nuke line, Jason pointed out the constructor does it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-18 07:37:48 +10:00
Emil Velikov	96b4cfe834	docs: update the sha256 checksums for 11.2.1 Turns out the previous tarballs got corrupted during upload which I carelessly forgot to check prior to deleting the local ones. Lesson learned - double check before removing the local ones. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `79b0e13913`)	2016-04-17 19:32:20 +01:00
Emil Velikov	2197581816	docs: add news item and link release notes for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-17 18:36:59 +01:00
Emil Velikov	03a234c1d1	docs: add sha256 checksums for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c65835d812`)	2016-04-17 18:36:59 +01:00
Emil Velikov	c15f457958	docs: add release notes for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `21e6440e82`)	2016-04-17 18:36:59 +01:00
Jason Ekstrand	f30f6e2625	i965/fs: Don't allow OOB array access of images We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't used to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0 which was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index which causes a hang. The solution is to just do the bounds check on all hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-04-15 22:47:33 -07:00
Jason Ekstrand	93db828e42	anv/device: Images are only enabled in scalar stages Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-15 16:40:56 -07:00
Marek Olšák	c1a2fe7fd1	gallium/radeon: handle vertex shaders that disable clipping & viewport Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-16 00:21:15 +02:00
Nanley Chery	696d8ff5a1	mesa/texstore: Use Driver.CompressedTexSubImage in the default CompressedTexImage Enable drivers to use their own implementation of this method instead of the mesa default. Since the drivers that currently overwrite dd_function_table::CompressedTexSubImage also overwrite ::CompressedTexImage, there should be no behavioral change. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-15 15:06:27 -07:00
Jason Ekstrand	5ec4ecce44	anv: Advertise vertexPipelineStoresAndAtomics based on scalar stages Previously, we just looked at the hardware generation but this meant that if you did INTEL_DEBUG=vec4 on BDW or SKL, you would have advertised but non-working features.	2016-04-15 14:53:16 -07:00
Jason Ekstrand	0166ad6ced	i965/vec4: Support full std140 layout for push constants Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	a112391d52	i965/vec4: Handle MOV_INDIRECT in pack_uniform_registers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	aaac8a1890	i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	61ee5e62a2	i965/vec4: Use can_do_writemask in can_reswizzle Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	75b68f9114	i965/vec4: Move can_do_writemask to vec4_instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:37 -07:00
Chad Versace	4a80890177	util: Fix warning of invalid return value _mesa_libgcrypt_init() returns NULL, but its return type is void. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-04-15 15:00:58 -07:00
Jason Ekstrand	cab30cc5f9	Merge branch 'vulkan'	2016-04-15 13:52:34 -07:00
Roland Scheidegger	64d3ae09b7	llvmpipe: (trivial) initialize src1_alpha var to NULL The blend code would do a conditional assignment based on it, causing valgrind to complain. Since that variable was actually unused in this case, this doesn't fix anything but the warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-15 22:51:28 +02:00
Jason Ekstrand	d8b85c96d1	Merge remote-tracking branch 'public/master' into vulkan	2016-04-15 13:35:16 -07:00
Jason Ekstrand	1a100d4f28	configure: Add support for the Intel Vulkan driver This adds a --with-vulkan-drivers option with one driver, "intel". In the future, we may add more drivers to this list. v2: Don't enable any drivers by default. This should prevent this patch from breaking anyone's build. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 13:29:29 -07:00
Jason Ekstrand	ce7e82fb6f	i965/surface_formats: Update some formats for more recent gens The surface format table hasn't entirely been kept up-to-date. This commit marks a couple more compressed formats as sampleable on gen8+ and adds the A4B4G4R4 format as renderable on gen9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 13:29:29 -07:00
Jason Ekstrand	7dac4a2889	util/list: Add list splicing functions This adds functions for splicing one list into another. These have more-or-less the same API as the kernel list splicing functions. The implementation, however, was stolen from the Wayland list implementation. Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2016-04-15 13:29:09 -07:00
Jason Ekstrand	17a181bfa6	Remove the Intel Vulkan readme	2016-04-15 13:17:08 -07:00
Tim Rowley	082f6d75ae	gallium/swr: confine c++11 flag to swr driver On the philosophy that a driver shouldn't change the compile flags for the entire tree, take the clove approach of moving the c++11 flag to the swr driver directory. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-15 14:43:01 -05:00
Tim Rowley	ee72fec9cf	gallium/swr: allow swr use as a swrast dri driver Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-15 14:21:50 -05:00
Eric Anholt	f6d21bcd6b	vc4: Fix subimage accesses to LT textures. This code started out like the T case, iterating over utile offsets, but I had partially switched it to iterating over pixel offsets. I hadn't caught this before because it's unusual to do piecemeal uploads to small textures. Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache. Also fixes 6 piglit tests related to glTexSubImage() and glGetTexSubImage(). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-15 11:57:17 -07:00
Mark Janes	ade3108bb5	util: Fix race condition on libgcrypt initialization Fixes intermittent Vulkan CTS failures within the test groups: dEQP-VK.api.object_management.multithreaded_per_thread_device dEQP-VK.api.object_management.multithreaded_per_thread_resources dEQP-VK.api.object_management.multithreaded_shared_resources Signed-off-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-15 10:24:40 -07:00
Jason Ekstrand	8403e6de9f	i965: Default to scalar GS	2016-04-15 09:54:42 -07:00
Jason Ekstrand	17d9a2b011	i965/surface_formats: Mark A4B4G4R4_UNORM as SKL+ only This is what is indicated by the bspec.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	d7189bdeee	Revert "i965/fs: Properly write-mask spills" This reverts commit `9c0109a1f6`.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	c3362453f9	Revert "i965/fs: Feel free to spill partial reads/writes" This reverts commit `2434ceabf4`.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	2d5bd66e4f	configure: Add support for detecting valgrind headers We have several places where the Vulkan driver explicitly hooks into valgrind when it's available. We need to be able to detect it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-15 09:41:15 -07:00
Eduardo Lima Mitev	7e4628da48	nir/print: Fix printing variable mode nir_variable_mode is currently a bitflag enum, while nir_print::print_var_decl() assumes it is still a numbered list. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-15 16:41:41 +02:00
John Sheu	f8752e0d95	xlib: remove MESA_GLX_VISUAL_HACK This removes a hack introduced in 1999 in the first version of fakeglx.c, with the comment: /* XXX revisit this after 3.0 is finished. */ Mesa 4.0 was released in 2001. It is now 2016, and Mesa 11.0 was released last year. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:46:00 +02:00
John Sheu	8a9c0f1025	xlib: fix leaks of returned values from XGetVisualInfo Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:45:46 +02:00
John Sheu	781232e0ac	xlib: fix memory leak of and remove vishandle from XMesaVisualInfo The vishandle member of XMesaVisualInfo is used to support the comparison of XVisualInfo instances by pointer value, in find_glx_visual(). The comparison however will always be false, as in every case the comparison is made, the VisualInfo instance being compared to is a new allocation passed in through a GLX API call. In addition, the XVisualInfo instance pointed to by vishandle is itself never freed, causing a memory leak. Since vishandle is essentially useless, we just remove it and thereby also fix the leak. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:45:28 +02:00
John Sheu	fe9d8cd79e	xlib: do not cache return value of glXChooseVisual/glXGetVisualFromFBConfig The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig is being cached in XMesaVisual.vishandle (and unconditionally overwritten on subsequent calls). However, these entry points are specified to return XVisualInfo instances to be owned by the caller and freed with XFree(), so the return values should not be retained. With this change, XMesaVisual.vishandle is essentially unused and will be removed in a subsequent change. v2: update commit message Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:44:34 +02:00
Jason Ekstrand	76fa7b16f4	Merge remote-tracking branch 'public/master' into vulkan	2016-04-14 18:30:52 -07:00
Jason Ekstrand	547032c56a	main/mtypes: Remove the "set" parameter from gl_uniform_block This is a left-over from the early days of the Vulkan driver	2016-04-14 18:27:09 -07:00
Jason Ekstrand	f0bbb34e49	Revert "i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT" This reverts commit `4115648a6b`. This commit was half-baked and probably never should have been committed. We'll add this back in properly later when we need it.	2016-04-14 18:22:08 -07:00
Jason Ekstrand	eeff133158	i965: Expose the surface format table Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 18:07:48 -07:00
Jason Ekstrand	d7cddbd6d6	nir/lower_io: Add UBOs and SSBOs to get_io_offset_src Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 18:07:40 -07:00
Jason Ekstrand	c825e29a82	nir/intrinsics: Add a vulkan_resource_index intrinsic This is used to facilitate the Vulkan binding model where each resource is described by a (descriptor set, binding, array index) tuple. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-14 17:20:05 -07:00
Jason Ekstrand	1e0012e3e4	nir: Add a descriptor_set field to nir_variable This is needed for supporting the Vulkan binding model Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-14 17:20:05 -07:00
Chad Versace	7a835b3fd9	dri: Fix robust context creation via EGL attribute driCreateContextAttribs() emits an error if bit __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But, EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of robust ES contexts. One requests a robust ES context by setting the EGL_CONTEXT_OPENGL_ROBUST_ACCESS attribute, which Mesa's EGL layer translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 17:38:41 -07:00
Jason Ekstrand	5567ae0547	Merge remote-tracking branch 'public/master' into vulkan	2016-04-14 17:14:28 -07:00
Leo Liu	8f4340c5e6	radeon/uvd: fix tonga feedback buffer size This only applies to tonga Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-14 19:33:44 -04:00
Jason Ekstrand	f1d29099b4	i965: Push everything if pull_param == NULL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 16:00:18 -07:00
Jason Ekstrand	963513bb24	i965/fs: Push small uniform arrays Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	71f8039f72	i965/fs: Rename demote_pull_constants to lower_constant_loads Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	8e76f664be	i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	056849772f	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	479e38ad63	i965/fs: Get rid of the param_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	30874216cb	i965/fs: Stop relying on param_size in assign_constant_locations Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and handle assigning pull constant locations until later. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	275855f315	i965/fs: Get rid of reladdr We aren't using it anymore. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	3c93cdfaf5	i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	63101177f3	nir: Add another index to load_uniform to specify the range read Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	27bd8ac6f3	i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	889e6054b7	i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr The subnr field is in bytes so we don't need to multiply by type_sz. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	7e08a13009	i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions It should work fine without it and the visitor can set it if it wants. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	40a8fe04dc	i965/fs: Add support for doing MOV_INDIRECT on uniforms Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	48cc8c284a	anv: Install the installable ICD	2016-04-14 15:15:00 -07:00
Jason Ekstrand	e40b867145	anv/intel_icd: Don't provide an absolute path The driver will be installed to $(libdir)/libvulkan_intel.so and just providing a driver name is enough for the loader. This also ensures that multi-arch systems work ok.	2016-04-14 15:15:00 -07:00
Jason Ekstrand	ca16373a2b	configure: Add initial support for enabling Vulkan drivers	2016-04-14 15:15:00 -07:00
Jason Ekstrand	e61c812f76	anv/pipeline: Use the right mask for lower_indirect_derefs	2016-04-14 15:13:29 -07:00
Ben Widawsky	a8975a91cc	i965: Make intel_get_param return an int This will fix the spurious error message: "Failed to query GPU properties." that was unintentionally added in `cc01b63d73`. This patch changes the function to return an int so that the caller is able to do stuff based on the return value. The equivalent of this patch was in the original series that fixed up the warning, but I dropped it at the last moment. It is required to make the desired behavior of not warning when trying to query GPU properties from the kernel unless there is something the user can do about it. v2: Use strerror (Jason) Make EINVAL check similar in all places (Ian) NOTE: Broadwell appears to actually have some issue where the kernel returns ENODEV when it shouldn't be. I will investigate this separately. Reported-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-14 15:13:22 -07:00
Brian Paul	aed975d5c5	st/mesa: fix sampler view leak in st_DrawAtlasBitmaps() I neglected to free the sampler view which was created earlier in the function. So for each glCallLists() command that used the bitmap atlas to draw text, we'd leak a sampler view object. Also, check for st_create_texture_sampler_view() failure and record GL_OUT_OF_MEMORY. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-14 15:32:18 -06:00
Nicolai Hähnle	a17911ceb1	gallium/radeon: handle failure when mapping staging buffer Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 16:29:23 -05:00
Nicolai Hähnle	8bd0f0df50	radeonsi: mark ssbo and images descriptor pointers dirty at beginning of CS Without this, we were getting non-deterministic VM faults under high pressure. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 16:29:23 -05:00
Jason Ekstrand	cb372b39ea	i965/vec4: Use UD rather than D for uniform indirects Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 14:25:01 -07:00
Jason Ekstrand	240d16ea94	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 14:24:57 -07:00
Samuel Pitoiset	bb4cdee9a4	nvc0: do not break the universe on GK110+ I removed that return 0 by mistake. Ooops. Fixes: `6e23fd4` ("nvc0: allow to use compute support on GM200") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-14 21:57:21 +02:00
Samuel Pitoiset	6e23fd420d	nvc0: allow to use compute support on GM200 This works like a charm but please note that NVF0_COMPUTE has to be set because compute support is still not enabled by default on GK110+. This will require more testing to make sure it won't break the 3D state. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-14 21:01:51 +02:00
Jason Ekstrand	34b5db17d9	i965: remove pointless diff with the master branch	2016-04-14 10:39:54 -07:00
Jason Ekstrand	769b5614f8	nir/opt_algebraic: Remove the encoding line This is an unneeded diff between the vulkan and master branches	2016-04-14 10:35:40 -07:00
Jason Ekstrand	c34be07230	spirv: Move to compiler/ While it does rely on NIR, it's not really part of the NIR core. At the moment, it still builds as part of libnir but that can be changed later if desired.	2016-04-14 10:28:47 -07:00
Jason Ekstrand	bfa3a38280	nir: Remove some pointless delta between vulkan and master	2016-04-14 10:24:33 -07:00
Jose Fonseca	ffcc00ce30	scons: Build NIR. Emil Velikov: - Attribute the src/{glsl,compiler}/nir move - Flesh out to separate SConscript Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:59 +01:00
Jose Fonseca	feb6732e80	nir: Use _snprintf on Windows. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	ba0c0e3940	nir: Avoid structure initialization expressions. Not supported by MSVC, and completely unnecessary -- inline functions work just as well. NIR_SRC_INIT/NIR_DEST_INIT could and probably should be replaced by the inline functions. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	8f96524f13	nir: Remove unistd.h include. It doesn't seem needed, and is not available on MSVC. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:31 +01:00
Jose Fonseca	f8e2f1fba5	nir: Avoid empty {} struct initializer. Not supported by MSVC and consistent through NIR. [Emil Velikov: rebase] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:33:52 +01:00
Emil Velikov	bb949e262c	gallium/swr: fold the almost identical Makefiles Rather than having two almost identical Makefiles, with various VPATH hacks just fold them, using COMMON_* variables and actually getting things buildable/shipable. v2: whitespace fixes, remove Makefile.sources-arch Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-04-14 16:30:57 +01:00
Tim Rowley	aee976703d	install-gallium-links.mk: handle multiple libraries Need to prevent bash from interpreting whitespace between libraries as a command line. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-14 16:30:57 +01:00
Marek Olšák	112291964e	radeonsi: don't overwrite the scratch offset in shader prologs Prologs only look at num_input_sgprs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	ffe44d0283	radeonsi: fold num_user_sgprs where it is possible Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	51c4034f9b	radeonsi: fix SGPRS calculation once more This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS, which bumped NUM_USER_SGPRS and uncovered this bug on SI. If this was fixed in LLVM, these workarounds wouldn't be needed. LLVM would have to look at the calling convention to know how many SGPR inputs are declared, and add VCC and the scratch wave offset (which is enabled even if we spill SGPRs but not VGPRs, oh well). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	aaf5be4a29	radeonsi: disable hw ETC2 on Polaris not supported by hw directly, but it's still fully supported by the driver Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-14 16:58:59 +02:00
Emil Velikov	4358cfc4ad	doxygen: remove git rebase fallouts Should never have been (git) added in the first place. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-14 09:49:09 +01:00
Jose Fonseca	8fcacb4f90	appveyor: Run unit tests. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	50ddf03ada	scons: Add a "check" target to run all unit tests. Except: - u_cache_test -- too long - translate_test -- unreliable (it's probably testing corner cases that translate module doesn't care about.) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	9ae0e8ee3c	test/unit: Make translate_test invoke translate_create by default. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	f8a51034bd	test/unit: Make pipe_barrier_test actually check correct behavior. So it can run unattended. Also make it silent by default. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jason Ekstrand	12f88ba32a	Merge remote-tracking branch 'public/master' into vulkan	2016-04-13 20:25:39 -07:00
Michel Dänzer	171a570f38	clover: Fix build against LLVM SVN >= r266163 createInternalizePass now takes a callback instead of a StringSet. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-04-14 11:53:41 +09:00
Nanley Chery	79fbec30fc	anv: Remove default scissor and viewport concepts Users should never provide a scissor or viewport count of 0 because they are required to set such state in a graphics pipeline. This behavior was previously only used in Meta, which actually just disables those hardware operations at pipeline creation time. Kristian noticed that the current assignment of viewport count reduces the number of viewport uploads, so it is not removed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:02:38 -07:00
Nanley Chery	1949e502bc	anv: Replace ::disable_scissor with ::use_rectlists Meta currently uses screenspace RECTLIST primitives that lie within the framebuffer rectangle. Since this behavior shouldn't change in the future, disable the scissor operation whenever rectlists are used. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	9f72466e9f	anv: Delete anv_graphics_pipeline_create_info::disable_viewport There are no users of this field. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	cff0f6b027	gen{7,8}_pipeline: Always set ViewportXYClipTestEnable For the following reasons, there is no behavioural change with this commit: the ViewportXYClipTest function of the CLIP stage will continue to be enabled outside of Meta (where disable_viewport is always false), and the CLIP stage is turned off within Meta, so this function will continue to be disabled in that case. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	992bbed98d	gen{7,8}_pipeline: Apply 3DPRIM_RECTLIST restrictions According to 3D Primitives Overview in the Bspec, when the RECTLIST primitive is in use, the CLIP stage should be disabled or set to have a different Clip Mode, and Viewport Mapping must be disabled: Clipping: Must not require clipping or rely on the CLIP unit’s ClipTest logic to determine if clipping is required. Either the CLIP unit should be DISABLED, or the CLIP unit’s Clip Mode should be set to a value other than CLIPMODE_NORMAL. Viewport Mapping must be DISABLED (as is typical with the use of screen-space coordinates). We swap out ::disable_viewport for ::use_rectlist, because we currently always use the RECTLIST primitive when we disable viewport mapping, and we'll likely continue to use this primitive. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:53:38 -07:00
Nanley Chery	88d1c19c9d	anv_cmd_buffer: Don't make the initial state dirty Avoid excessive state emission. Relevant state for an action command will get set by the user: From Chapter 5. Command Buffers, When a command buffer begins recording, all state in that command buffer is undefined. [...] Whenever the state of a command buffer is undefined, the application must set all relevant state on the command buffer before any state dependent commands such as draws and dispatches are recorded, otherwise the behavior of executing that command buffer is undefined. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:52:24 -07:00
Nanley Chery	9fae6ee026	anv/meta: Don't set the dynamic state for disabled operations CmdSet* functions dirty the CommandBuffer's dynamic state. This causes the new state to be emitted when CmdDraw is called. Since we don't need the state that would be emitted, don't call the CmdSet* functions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:52:20 -07:00
Nanley Chery	76b0ba087c	anv/clear: Disable the scissor operation Since the scissor rectangle always matches that of the framebuffer, this operation isn't needed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:45:18 -07:00
Jason Ekstrand	b63a98b121	nir/dead_variables: Configurably work with any variable mode The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-13 15:45:10 -07:00
Kenneth Graunke	505a8fbdf8	i965: Switch to NIR for ldexp lowering. The old GLSL IR based lowering doesn't quite work right in all cases, and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new approach in NIR passes all the tests. There's not likely to be a ton of advantage to lowering early in GLSL IR anyway, so...switch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:33 -07:00
Jason Ekstrand	4455bfa9a0	nir/algebraic: Add lowering for ldexp The algorithm used is different from both the naive suggestion from the GLSL spec and the one used in GLSL IR today. Unfortunately, the GLSL IR implementation that we have today doesn't handle denormals (for those that care) or the case where the float source is +-inf. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:19 -07:00
Jason Ekstrand	765dd65349	i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:08 -07:00
Jason Ekstrand	745b3d295e	nir: Add more modulus opcodes These are all needed for SPIR-V Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:00 -07:00
Jason Ekstrand	d880c6f9f5	i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-13 15:39:20 -07:00
Jason Ekstrand	dd616cab01	nir/lower_io: Allow for a full bitmask of modes Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:10 -07:00
Jason Ekstrand	2caaf0ac5e	nir/lower_indirect: nir_variable_mode is now a bitfield Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:07 -07:00
Jason Ekstrand	ffa0e12e15	nir: Convert nir_variable_mode to a bitfield There are several passes where we need to specify some set of variable modes that the pass needs to operate on. This lets us easily do that. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:40:12 -07:00
George Kyriazis	f69a61b1aa	gallium/swr: Make flat shading tris work. - Incorporate flatshade flag into the shader generation - Use provoking vertex (vc) in shader when flat shading. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-13 13:46:37 -05:00
Rob Clark	c53a12fedc	Revert "freedreno/a4xx: better occlusion/sample counting" This reverts commit `62fa868728`. dEQP-GLES3.functional.occlusion_query.* was unhappy about that change. Still not really sure what the other slots in the sample results buffer are. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:40 -04:00
Rob Clark	46e9bbc918	freedreno/a4xx: rasterizer_discard support This one is slightly annoying, since trying to write RBRC from draw would clobber values set in the tiling/gmem code. We could do command- stream patching for RBRC, as is done on a3xx. Although since it seems to be a rarely used feature, it is easier just to do RMW to set/clear the bit. Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles and related tests. a3xx still needs the same feature, although there it probably makes more sense to take advantage of the existing cmdstream patching which is required for RBRC for other reasons. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:21 -04:00
Rob Clark	216225ce57	freedreno/ir3: fix array textures on a4xx Seems like a4xx needs offset added to array index for all arrays, whereas a3xx only for cubemap arrays. Fixes a whole swath of dEQP fails (roughly sampler2darray). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:14 -04:00
Rob Clark	7e93b26b5d	freedreno: fix stream-out offset handling for lines/tris We need to increment offset by # of vertices, not by # of prims. Fixes a bunch of dEQP fails involving prims other than points. For example, dEQP-GLES3.functional.transform_feedback.position.lines_separate Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:02 -04:00
Rob Clark	6ca6e80f61	freedreno: fix handling for stream-out offsets If changed && append, we shouldn't be resetting the internal offset back to zero. This fixes issues w/ sequences like: glBeginTransformFeedback() glDraw() glPauseTransformFeedback() glDraw() glResumeTransformFeedback() glDraw() glEndTransformFeedback() Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3 and related tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:54 -04:00
Rob Clark	0a4b0fc315	freedreno: fix prims-emitted query This should only count when TF is not paused. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:47 -04:00
Rob Clark	a7eb12d089	freedreno: fix max-line-width dEQP noticed that we were advertising completely bogus values. The actual maximum is 127.0f. But we have to use an artificially low maximum to work around a bug in the dEQP test, which gets confused when the max line width is too large and lines start going off-screen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:31 -04:00
Rob Clark	6bf462a1ab	freedreno: add flag to enable dEQP hacks Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:24 -04:00
Rob Clark	f68f6c0246	freedreno/ir3: hack to avoid getting stuck in a loop There are still some edge cases which result in a neighbor-loop. Which needs to be fixed, but this hack at least makes deqp tests finish. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:13 -04:00
Rob Clark	dd70945e09	freedreno/ir3: use (ss) instead of (sy) for ldlv Fixes a bunch of flat-varying fail on a4xx (where we need to use ldlv to read the un-interpolated varying). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:05 -04:00
Rob Clark	b35ad6e701	freedreno/ir3: cleanup double cmps.s from frontend Since we cannot mov into a predicate register, the frontend uses a 'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this since it has no way to know that the source cond instruction (ie. for a kill, br, etc) will only be used to write the predicate reg. Detect this, and re-write the instruction writing p0.x to skip the original cmps.[sfu]. (It is done like this, rather than re-writing the dest of the first cmps.[sfu] in case the first cmps.[sfu] actually has other users.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:14:41 -04:00
Matt Turner	9bac27dbf9	glsl: Rename "vertex_input_slots" -> "is_vertex_input" vertex_input_slots would be an appropriate name for an integer, but not a bool. Also remove a cond ? true : false from a count_attribute_slots() call site, noticed during the rename. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 11:00:21 -07:00
Jose Fonseca	9586468c03	gallivm: Workaround LLVM PR 27332. The credit for finding and isolating this bug goes to Vinson and Roland. The buggy LLVM versions were found by doing opt -instcombine llvm-pr27332.ll > /dev/null where llvm-pr27332.ll is the IR from https://llvm.org/bugs/show_bug.cgi?id=27332#c3 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 16:42:55 +01:00
Marek Olšák	dd0a296895	gallium/radeon: move a comment to the correct place trivial	2016-04-13 17:31:03 +02:00
Nicolai Hähnle	9e9a2bb44a	radeonsi: gate PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT by LLVM version Otherwise we incorrectly claim ARB_ssbo support even with older LLVM versions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94917 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-13 10:06:22 -05:00
Elie TOURNIER	f04565c876	doxygen: Generate Doxygen for NIR Now, one can do the following to generate and read the nir Doxygen: cd $MESA_TOP/doxygen make firefox nir/index.html Update v2: Correct TAGFILES in nir.doxy Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> [Emil Velikov] v3: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:33 +01:00
Elie TOURNIER	3157df58d0	doxygen: update glsl link Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> Tested-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:30 +01:00
Rhys Kidd	0e9fc1228a	doxygen: Remove deprecated settings in common.doxy These Doxygen features are deprecated, as reported by Doxygen 1.8.9.1 Warning: Tag `USE_WINDOWS_ENCODING' at line 66 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `DETAILS_AT_TOP' at line 157 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `HTML_ALIGN_MEMBERS' at line 616 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `XML_SCHEMA' at line 848 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `XML_DTD' at line 854 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `MAX_DOT_GRAPH_WIDTH' at line 1115 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `MAX_DOT_GRAPH_HEIGHT' at line 1123 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:26 +01:00
Rhys Kidd	3d18ab72bf	doxygen: Fix typo in doxygen/tnl.doxy TAGFILE relative folder should match .tag file Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:23 +01:00
Rhys Kidd	4ba409a364	doxygen: Correct TAGFILE linkage of main core.doxy was renamed to main.doxy, along with output folder in the below 2004 commit. Correct the other modules' TAGFILE linkage to find the main folder. commit `3ef972f538` Author: Brian Paul <brian.paul@tungstengraphics.com> Date: Sun May 16 22:07:02 2004 +0000 Replaced 'core' with 'main'. Other minor updates. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:19 +01:00
Rhys Kidd	7703a3e3d0	doxygen: Update .gitignore The last of these output directories was removed in 2007. commit `c2e0570831` Author: Jerome Glisse <glisse@freedesktop.org> Date: Fri Feb 16 23:18:56 2007 +0100 Update doxygen doc to reflet vbo changes. Update doxygen doc, array_cache no longuer exist, new shiny vbo modules is there. Tested on unix, but i think i didn't broke that bat :). commit `3ef972f538` Author: Brian Paul <brian.paul@tungstengraphics.com> Date: Sun May 16 22:07:02 2004 +0000 Replaced 'core' with 'main'. Other minor updates. commit `69db632a9d` Author: Jose Fonseca <j_r_fonseca@yahoo.co.uk> Date: Thu May 1 23:32:54 2003 +0000 Move the Doxygen configuration files into the usual places and integrate with the build system. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:15 +01:00
Rhys Kidd	ced18f4d60	doxygen: Remove references to miniglx miniglx was removed in February 2010. Clean up remaining unnecessary doxygen references. commit `a9e3669683` Author: Kristian Høgsberg <krh@bitplanet.net> Date: Thu Feb 25 16:17:04 2010 -0500 Remove remaining miniglx references Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:12 +01:00
Rhys Kidd	29b805b929	doxygen: Fix doxygen/gbm.doxy TAGFILES There has never been a doxygen/gbm_setup output folder. Appears to have been a copy-paste error from original commit in `245341f406`. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:08 +01:00
Rhys Kidd	684e7a4a14	doxygen: Correct TAGFILE relative paths Per Doxygen documentation, to combine external documentation (stored in a *.tag file) with a project the TAGFILES option should be set in the configuration file. A tag file typically only contains a relative location of the documentation from the point where doxygen was run. So when you include a tag file in another project you have to specify where the external documentation is located in relation to this project. You can do this in the configuration file by assigning the (relative) location to the tag files specified after the TAGFILES configuration option. If you use a relative path it should be relative with respect to the directory where the HTML output of your project is generated; so a relative path from the HTML output directory of a project to the HTML output of the other project that is linked to. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:04 +01:00
Rhys Kidd	f066fb529b	doxygen: Fix doxygen/glapi.doxy The src/mesa/glapi folder was relocated in the below commit. Amend the doxygen/glapi.doxy INPUT setting accordingly. Whilst here, in addition this change also avoids a bug in the consolidated Doxygen output caused by doxygen/glapi.doxy inadvertently overwriting doxygen/swrast.tag via its GENERATE_TAGFILE setting. This bug depended upon the specific order each *.tag was built. commit `296adbd545` Author: Chia-I Wu <olv@lunarg.com> Date: Mon Apr 26 12:56:44 2010 +0800 glapi: Move to src/mapi/. Move glapi to src/mapi/{glapi,es1api,es2api}. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:43:58 +01:00
Rhys Kidd	cf3bc91c06	doxygen: Remove src/mesa/shader/ references Mesa has not had a src/mesa/shader/ folder since Mesa 7.9 removed it in October 2010, as part of a revised GLSL compiler written by Intel. Remove doxygen/shader.doxy and consequential changes made throughout. In addition to removing an unnecessary Doxygen doxyfile, this change also avoids a bug in the consolidated Doxygen output caused by doxygen/shader.doxy inadvertently overwriting doxygen/swrast.tag via its GENERATE_TAGFILE setting. This bug depended upon the specific order each *.tag was built. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:43:54 +01:00
Marek Olšák	04f15e491f	gallium/radeon: add an env variable to force a level of aniso filtering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-13 12:42:28 +02:00
Jose Fonseca	cc5d8b678e	llvmpipe: Test rounding of x.5. Leverage nearbyintf function, which should be available on all C99 implementations. Trivial.	2016-04-13 11:13:05 +01:00
Roland Scheidegger	cb438d8b3e	gallivm: use llvm.nearbyint instead of llvm.round. We used to use sse roundps intrinsic directly, but switched to use the llvm intrinsics for rounding with `e4f01da15d`. However, llvm semantics follows standard math lib round function which is specced to do roundNearestAwayFromZero but we really want roundNearestEven (moreover, using round generates atrocious code since the cpu can't do it directly and it results in scalar calls to libm __roundf). So, use llvm.nearbyint instead, which does exactly the right thing, and even has the advantage of being available with llvm 3.3 too. (I've verified it actually generates a roundps instruction with llvm 3.3.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-13 11:13:03 +01:00
Pierre Moreau	f525db6358	nv50/ra: `isinf()` is in namespace `std` since C++11. This fixes a compile error while building Nouveau with C++11 enabled (and glibc >= 2.23). This happens if SWR is enabled, as it forces C++11. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com> https://bugs.freedesktop.org/show_bug.cgi?id=94907	2016-04-13 07:41:13 +01:00
Jose Fonseca	fa46848e51	scons: Allow building with Address Sanitizer. libasan is never linked to shared objects (which doesn't go well with -z,defs). It must either be linked to the main executable, or (more practically for OpenGL drivers) be pre-loaded via LD_PRELOAD. Otherwise works. I didn't find anything with llvmpipe. I suspect the fact that the JIT compiled code isn't instrumented means there are lots of errors it can't catch. But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster alternative to Valgrind. Usage (Ubuntu 15.10): scons asan=1 libgl-xlib export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib LD_PRELOAD=libasan.so.2 any-opengl-application Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 06:54:32 +01:00
Kenneth Graunke	d1c89f6005	mesa: Change an error code in glSamplerParameterI[iu]v(). This is supposed to be INVALID_OPERATION in ES. We already did this for the fv/iv variants, but not Iiv/Iuv, which are new in ES 3.2 (or extensions). Fixes: ES31-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-12 20:30:32 -07:00
Jose Fonseca	46bfcd61f5	softpipe: Free tgsi.image elements on context destruction. Courtesy of address sanitizer. [airlied: free buffers as well] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Edward O'Callaghan	5a3d928e2c	softpipe: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Eric Anholt	3b63301d9f	vc4: Work around hardware limits on the number of verts in a single draw. Fixes rendering failures in glmark2's refract and bump:render-mode=high-poly demos, and partially in its terrain demo.	2016-04-12 19:10:51 -07:00
Thomas Hindoe Paaboel Andersen	6d6525a377	softpipe: avoid buffer overflow Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:35 +10:00
Thomas Hindoe Paaboel Andersen	b89708f95f	tgsi: fix buffer overflow Increase r to four channels as rgba is written to it Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:34 +10:00
Tim Rowley	b9294bc345	swr: handle pci cap requests Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Tim Rowley	b19d214b23	swr: support samplers in vertex shaders Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Nicolai Hähnle	10cfd7a604	radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2 This is the last necessary bit for OpenGL 4.2 support. All driver-specific functionality has already been implemented as part of extensions. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 20:13:49 -05:00
Iurie Salomov	047e3264f6	va: check null context in vlVaDestroyContext Signed-off-by: Iurie Salomov <iurcic@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2016-04-13 00:52:53 +01:00
Jason Ekstrand	8f3b516f2e	nir/clone: Copy bit size when cloning registers Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-12 16:41:58 -07:00
Marek Olšák	8e70a58af3	radeonsi: fix a critical SI hang since PIPELINESTAT_START/STOP was added For some reason unknown to me, SI hangs if the event is written after CONTEXT_CONTROL.	2016-04-13 01:05:15 +02:00
Kenneth Graunke	95d622e16d	glsl: Don't copy propagate or tree graft precise values. This is kind of a hack. We currently track precise requirements by decorating ir_variables. Propagating or grafting the RHS of an assignment to a precise value into some other expression tree can lose those decorations. In the long run, it might be better to replace these ir_variable decorations with an "exact" decoration on ir_expression nodes, similar to what NIR does. In the short run, this is probably good enough. It preserves enough information for glsl_to_nir to generate "exact" decorations, and NIR will then handle optimizing these expressions reasonably. Fixes ES31-CTS.gpu_shader5.precise_qualifier. v2: Drop invariant handling, as it shouldn't be necessary (caught by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-12 15:57:48 -07:00
Mark Janes	9e351e077b	util: Fix race condition on libgcrypt initialization Fixes intermittent Vulkan CTS failures within the test groups: dEQP-VK.api.object_management.multithreaded_per_thread_device dEQP-VK.api.object_management.multithreaded_per_thread_resources dEQP-VK.api.object_management.multithreaded_shared_resources Signed-off-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-12 15:38:43 -07:00
Kristian Høgsberg Kristensen	8ec971a997	i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo Copy and paste error in commit `eafeb8db66`: i965/tiled_memcpy: Unroll bytes==64 case. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 15:32:43 -07:00
Kristian Høgsberg Kristensen	1af0f0151c	glsl/linker: Recurse on struct fields when adding shader variables ARB_program_interface_query requires that we add struct fields recursively down to basic types. Fixes 52 struct test cases in dEQP-GLES31.functional.program_interface_query.* Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	778fd46aa4	glsl/linker: Pass name and type through to create_shader_variable() No functional change here, but this now lets us recurse through structs in add_shader_variable(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	09f0121593	glsl/linker: Pass absolute location to add_shader_variable() This lets us pass in the absolute location of a variable instead of computing it in add_shader_variable() based on variable location and bias. This is in preparation for recursing into struct variables. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	8ab6aae4dc	glsl/linker: Add add_shader_variable() helper This consolidates the combination of create_shader_variable() and add_program_resource() into a new helper function. No functional difference, but we'll expand add_shader_variable() in the next few commits. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Matt Turner	eafeb8db66	i965/tiled_memcpy: Unroll bytes==64 case. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:37:05 -07:00
Roland Scheidegger	0e605d9b3a	i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle. The existing code uses SSSE3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments v4: [mattst88] Rebase Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 14:37:01 -07:00
Matt Turner	fc88b4babf	i965/tiled_memcpy: Move SSSE3 code back into inline functions. This will make adding SSE2 code a lot cleaner. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:59 -07:00
Matt Turner	0a5d8d9af4	i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle. Replaces four byte loads and four byte stores with a load, bswap, rotate, store; or a movbe, rotate, store. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:56 -07:00
Nicolai Hähnle	a191e6b719	radeonsi: fix bounds check in si_create_vertex_elements This was triggered by dEQP-GLES3.functional.vertex_array_objects.all_attributes Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 16:32:46 -05:00
Nicolai Hähnle	4285a97cea	docs: mark atomic counters and SSBOs as done for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:51 -05:00
Nicolai Hähnle	bfd11c5996	radeonsi: enable shader buffer pipe caps Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:48 -05:00
Nicolai Hähnle	4e81843b13	radeonsi: add shader buffer support to TGSI_OPCODE_RESQ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:45 -05:00
Nicolai Hähnle	01109282ce	radeonsi: add shader buffer support to TGSI_OPCODE_STORE Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:43 -05:00
Nicolai Hähnle	745014c502	radeonsi: add shader buffer support to TGSI_OPCODE_LOAD Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:41 -05:00
Nicolai Hähnle	68bc25c931	radeonsi: add shader buffer support to TGSI_OPCODE_ATOM* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:38 -05:00
Nicolai Hähnle	c6f5d000db	radeonsi: add offset parameter to buffer_append_args Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:35 -05:00
Nicolai Hähnle	c565466eea	radeonsi: adjust buffer_append_args to take a 128 bit resource Move the buffer resource extraction code out into its own function. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:32 -05:00
Nicolai Hähnle	e88018ffe5	radeonsi: preload shader buffers in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:29 -05:00
Nicolai Hähnle	c495c0ad37	radeonsi: implement set_shader_buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:26 -05:00
Nicolai Hähnle	73c8b85b64	radeonsi: move resetting of constant buffers into a separate function This will be re-used for shader buffers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:04 -05:00
Haixia Shi	35ade36c88	dri/i965: fix incorrect rgbFormat in intelCreateBuffer(). It is incorrect to assume that pixel format is always in BGR byte order. We need to check bitmask parameters (such as \|redMask\|) to determine whether the RGB or BGR byte order is requested. v2: reformat code to stay within 80 character per line limit. v3: just fix the byte order problem first and investigate SRGB later. v4: rebased on top of the GLES3 sRGB workaround fix. v5: rebased on top of the GLES3 sRGB workaround fix v2. Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:06:45 -07:00
Kenneth Graunke	e303e88a9c	glsl: Reject illegal qualifiers on atomic counter uniforms. This fixes dEQP-GLES31.functional.uniform_location.negative.atomic_fragment dEQP-GLES31.functional.uniform_location.negative.atomic_vertex Both of which have lines like layout(location = 3, binding = 0, offset = 0) uniform atomic_uint uni0; The ARB_explicit_uniform_location spec makes a very tangential mention regarding atomic counters, but location isn't something that makes sense with them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-12 14:06:42 -07:00
Kenneth Graunke	929e44099f	glsl: Add a method to print error messages for illegal qualifiers. Suggested by Timothy Arceri a while back on mesa-dev: https://lists.freedesktop.org/archives/mesa-dev/2016-February/107735.html Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-04-12 14:06:42 -07:00
John Sheu	7f08547248	xlib: fix memory leak on Display close The XMesaVisual instances freed in the visuals table on display close are being freed with a free() call, instead of XMesaDestroyVisual(), causing a memory leak. Signed-off-by: John Sheu <sheu@google.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-12 13:56:41 -06:00
Jakob Sinclair	d04bb14d04	st/mesa: Replace GLvoid with void GLvoid was used before in OpenGL but it has changed to just using void. All GLvoids in mesa's state tracker have been changed to void in this patch. Tested this with piglit and no problems were found. No compiler warnings. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-12 13:37:16 -06:00
Bas Nieuwenhuizen	126da23d70	radeonsi: Mark ARB_robust_buffer_access_behavior as supported. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 20:53:10 +02:00
Bas Nieuwenhuizen	70dcd841f7	gallium: Add capability for ARB_robust_buffer_access_behavior. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 20:53:06 +02:00
Bas Nieuwenhuizen	285dc05055	mesa: Expose the ARB_robust_buffer_access_behavior extension. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 20:40:26 +02:00
Miklós Máté	aad8707b28	main: rework the compatibility check of visuals in glXMakeCurrent Now it follows the compatibility criteria listed in section 2.1 of the GLX 1.4 specification. This is needed for post-process effects in SW:KotOR. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 19:48:01 +02:00
Tim Rowley	df37b06276	swr: [rasterizer core] warning cleanup Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	06c59dc417	swr: [rasterizer] Put in rudimentary garbage collection for the global arena allocator - Check for unused blocks every few frames or every 64K draws - Delete data unused since the last check if total unused data is > 20MB Doesn't seem to cause a perf degradation Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	b990483de2	swr: [rasterizer core] Put DRAW_CONTEXT on a diet No need for 256 pointers per DC. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	a939a58881	swr: [rasterizer core] Add experimental support for hyper-threaded front-end Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	9a8146d0ff	swr: [rasterizer] Avoid segv in thread creation on machines with non-consecutive NUMA topology. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	2c71fd4bf8	swr: [rasterizer core] Replace all naked OSALIGN macro uses with OSALIGNSIMD / OSALIGNLINE Future proofing Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	32a8653ad2	swr: [rasterizer] Ensure correct alignment of stack variables used as vectors Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	e1871c4459	swr: [rasterizer core] Quantize depth to depth buffer precision prior to depth test/write. Fixes z-fighting issues. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
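One way to picture the depth quantization mentioned in the commit above is the sketch below. It assumes an unsigned normalized depth buffer of `bits` bits and round-to-nearest; the function name `quantize_depth` and both assumptions are illustrative and may not match what the SWR rasterizer actually does.

```python
def quantize_depth(z, bits):
    """Snap a depth value in [0, 1] to what a `bits`-bit unorm buffer can store.

    Assumption for this sketch: unsigned normalized storage with
    round-to-nearest.
    """
    max_value = (1 << bits) - 1
    z = min(max(z, 0.0), 1.0)  # clamp to the representable range
    return round(z * max_value) / max_value


# Two fragments whose depths differ by less than one 16-bit step
# end up with the same value once quantized.
a = quantize_depth(0.5000001, 16)
b = quantize_depth(0.5000002, 16)
print(a == b)
```

Testing against the quantized value means the comparison sees the same number that would be written to the buffer, which is presumably why the commit reports fewer z-fighting artifacts.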
Tim Rowley	2a19aca05f	swr: [rasterizer common] win32 build fixups Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	c25244f2f7	swr: [rasterizer core] Affinitize thread scratch space to numa node of worker Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:04 -05:00
Tim Rowley	f89f6d562a	swr: [rasterizer] Misc fixes identified by static code analysis No perf loss detected Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:04 -05:00
Brian Paul	6c01478213	st/mesa: fix memleak in glDrawPixels cache code If the glDrawPixels size changed, we leaked the previously cached texture, if there was one. This patch fixes the reference counting, adds a refcount assertion check, and better handles potential malloc() failures. Tested with a modified version of the drawpix Mesa demo which changed the image size for each glDrawPixels call. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-12 10:44:45 -06:00
Jose Fonseca	b5105e67a8	gallium: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	b025c23cfe	softpipe: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	2f13d7543f	svga: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	7279098dc5	mesa: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Marek Olšák	686b018ab3	r600g: use common scissor and viewport code It's the same as radeonsi. This adds guard band support to r600g. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:25 +02:00
Marek Olšák	87a5b07f90	gallium/radeon: add R600/Evergreen/Cayman support to common viewport code Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:25 +02:00
Marek Olšák	2ca5566ed7	radeonsi: move scissor and viewport states into gallium/radeon Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:24 +02:00
Marek Olšák	db00f6cc9c	radeonsi: use guard band clipping Guard band clipping speeds up rasterization for primitives that are partially off-screen. This change in particular results in small framerate improvements in a wide range of games. Started by Grigori Goronzy <greg@chown.ath.cx>. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:12:14 +02:00
Marek Olšák	cb21f8a97c	radeonsi: compute scissor from viewport in set_viewport_states and clamp it right before emitting. This is a prerequisite for computing the guard band. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:49 +02:00
Marek Olšák	5b6a0b7fc0	gallium/radeon: set GTT WC on tiled textures Just for consistency. This should have no effect, because OpenGL textures always go to VRAM. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	5a4b74d1ba	gallium/radeon: relax requirements on VRAM placements on APUs This makes Tonga with vramlimit=128 2x faster in Heaven. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	a57309f807	winsys/amdgpu: remove hack for low VRAM configuration A better solution will be used. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b36f19bf98	r600g: disable aniso filtering for non-mipmap textures on EG this is the default behavior of the closed driver when running on VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	3bc2d967c4	r600g: clean up aniso state translation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b0d4469519	radeonsi: disable aniso filtering for non-mipmap textures on SI-CI The closed driver does this, but it looks at base_level and last_level and uses a conditional assignment, which LLVM can't generate on SGPRs. That led me to invent this solution that abuses the image descriptor. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	ddd33431c5	radeonsi: clean up aniso state translation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	f7420ef5b4	radeonsi: enable some sampler fields to match the closed driver copied from the Vulkan driver Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	1a98be001f	gallium/radeon: fix maximum texture anisotropy setup We were overdoing it for non-power-of-two values. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	2d7be5d37e	gallium/radeon: never choose a linear tiling for DB surfaces Just for consistency. This is actually not a problem, because both addrlib and radeon check and fix this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b7878146c4	gallium/radeon: removing dead code for sharing stencil buffers This is a remnant of the times when the DDX was allocating depth-stencil buffers for windows. Now, st/dri allocates them and doesn't share them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	73aeebd772	radeonsi: allow clearing buffers >= 4 GB Only CMASK and DCC clears can use this, because only textures can be so large. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	1dd8832e04	gallium/radeon: allow allocating textures >= 4 GB Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	0689741e51	winsys/radeon: fix printing allocation failures print as unsigned instead of signed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	0ba0933f48	winsys/amdgpu: add support for 64-bit buffer sizes v2: fail in radeon_winsys_bo_create if size > 32 bits Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	7e78b5ed38	pb_buffer: switch pb_buffer::size to 64 bits being able to allocate more than 4 GB may be useful Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	e241a63512	gallium/radeon: remove R600_QUERY_HW_FLAG_TIMER not used anymore Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	0222351fc1	gallium/radeon: merge timer and non-timer query lists All of them are paused only between IBs. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	7347c068d8	r600g: don't manually stop queries for blitter r600_set_active_query_state does it better. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	12fee5b93e	r600g: add pausing pipeline & streamout queries into set_active_query_state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	e90fe60b72	r600g: implement set_active_query_state for pausing occlusion queries Use ZPASS_INCREMENT_DISABLE everywhere. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	5248676f87	r600g: simplify r600_set_occlusion_query_state The caller does the same checking. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	b82893f93a	gallium/radeon: move pipeline stat context flags to common code Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	aa79a3269f	r600g: fix typo in r600 register definitions Acked-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	a4c288d8e1	gallium/radeon: unify checking streamout enable state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	466aa57185	radeonsi: fix mask checking when emitting scissors and viewports Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>	2016-04-12 14:29:46 +02:00
Marek Olšák	f3eebb84eb	radeonsi: implement and rely on set_active_query_state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Marek Olšák	e599b8f384	gallium: pause queries for all meta ops Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Marek Olšák	26171bd67e	gallium: add pipe_context::set_active_query_state for pausing queries Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Bas Nieuwenhuizen	fc67375379	radeonsi: Synchronize a streamout write after read hazard. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 13:55:38 +02:00
Hans de Goede	dccdb655a1	nv30: Add missing PIPE_SHADER_CAP_INTEGERS to get_shader_param() Add missing PIPE_SHADER_CAP_INTEGERS for frag shaders to nv30_screen_get_shader_param(). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-12 11:41:12 +02:00
Haixia Shi	b0e3ba61b5	dri/i965: extend GLES3 sRGB workaround to cover all formats It is incorrect to assume BGRA byte order for the GLES3 sRGB workaround. v2: use _mesa_get_srgb_format_linear to handle all formats Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 02:06:12 -07:00
Eduardo Lima Mitev	ea8a65f503	i965: Add autogenerated 'brw_nir_trig_workarounds.c' to gitignore Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 10:44:19 +02:00
Rhys Kidd	703c1e69d8	glsl: Update hash table comments in constant propagation Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 01:29:19 -07:00
Dave Airlie	afa8707ba9	softpipe: add SSBO/shader atomics support. This adds support for the features required for ARB_shader_storage_buffer_object and ARB_shader_atomic_counters, ARB_shader_atomic_counter_ops. [airlied: some cleanups applied] Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:16:13 +10:00
Dave Airlie	c2aeeca455	draw: add support for passing buffers to vs/gs shaders. Like the image code, but for shader buffers this time. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:36 +10:00
Dave Airlie	081a958bcd	tgsi: add support for buffer/atomic operations to tgsi_exec. This adds support for doing load/store/atomic operations on buffer objects. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:33 +10:00
Dave Airlie	9c7a0d188a	tgsi: set nonhelpermask for vertex shaders For atomic operations we really need to avoid executing unnecessary shaders, so for some tests that just draw a single point we only want one vertex to get processed not 4, this fixes a number of the atomic counters tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:16 +10:00
Ian Romanick	193a5cee6a	nir: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-11 19:24:19 -07:00
Markus Wick	18c8b927e2	nir: Merge redundant integer clamping. Dolphin uses them a lot. Range tracking would be better in the long term, but these two lines work fine for now. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 18:48:50 -07:00
Kenneth Graunke	bfd17c76c1	i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos is multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:17 -07:00
Kenneth Graunke	b0dffdc616	i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar. I want to be able to read other fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:12 -07:00
Kenneth Graunke	808d26c771	nir: Silence unused "options" warning in algebraic passes. Some passes may not refer to options->..., at which point the compiler will warn about an unused variable. Just cast to void unconditionally to shut it up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:08 -07:00
Kenneth Graunke	5886cd79a0	nir: Do basic constant reassociation. Many shaders contain expression trees of the form: const_1 * (value * const_2) Reorganizing these to (const_1 * const_2) * value will allow constant folding to combine the constants. Sometimes, these constants are 2 and 0.5, so we can remove a multiply altogether. Other times, it can create more immediate constants, which can actually hurt. Finding a good balance here is tricky. While much more could be done, this simple patch seems to have a lot of positive benefit while having a low downside. shader-db results on Broadwell: total instructions in shared programs: 8963768 -> 8961369 (-0.03%) instructions in affected programs: 438318 -> 435919 (-0.55%) helped: 1502 HURT: 245 total cycles in shared programs: 71527354 -> 71421516 (-0.15%) cycles in affected programs: 11541788 -> 11435950 (-0.92%) helped: 3445 HURT: 1224 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:43:55 -07:00
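The rewrite described in the commit above, `const_1 * (value * const_2)` into `(const_1 * const_2) * value`, can be sketched on a toy expression tree. This is not NIR's algebraic pass: the tuple representation and the names `is_const` and `reassociate` are hypothetical, and the sketch ignores the floating-point precision questions a real compiler has to consider.

```python
# Expressions are nested tuples ("mul", lhs, rhs). Numbers are constants,
# strings stand for non-constant values. This representation is hypothetical.

def is_const(e):
    return isinstance(e, (int, float))


def reassociate(e):
    """Rewrite const_1 * (value * const_2) as (const_1 * const_2) * value."""
    if not (isinstance(e, tuple) and e[0] == "mul"):
        return e
    lhs, rhs = reassociate(e[1]), reassociate(e[2])
    # Canonicalize so that a constant operand, if any, sits on the left.
    if is_const(rhs) and not is_const(lhs):
        lhs, rhs = rhs, lhs
    if is_const(lhs) and is_const(rhs):
        return lhs * rhs  # plain constant folding
    if (is_const(lhs) and isinstance(rhs, tuple)
            and rhs[0] == "mul" and is_const(rhs[1])):
        folded = lhs * rhs[1]
        # When the constants cancel (e.g. 2 and 0.5) the multiply disappears.
        return rhs[2] if folded == 1 else ("mul", folded, rhs[2])
    return ("mul", lhs, rhs)


print(reassociate(("mul", 2, ("mul", "x", 0.5))))
print(reassociate(("mul", 3, ("mul", "x", 4))))
```

The first call shows the case the commit highlights, where constants of 2 and 0.5 let the multiply be removed entirely; the second shows two constants merged into one.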
Boyuan Zhang	1c7ba7f156	radeon/uvd: alignment fix for decode message buffer Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-04-11 19:30:47 -04:00
Brian Paul	704d203d5f	st/mesa: replace _mesa_sysval_to_semantic table with function Instead of using an array indexed by SYSTEM_VALUE_x, just use a switch statement. This fixes a regression caused by inserting new SYSTEM_VALUE_ enums but not updating the mapping to TGSI semantics. v2: fix a few switch statement mistakes for compute-related enums Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-11 17:04:13 -06:00
Jason Ekstrand	a9e6213edd	nir/lower_system_values: Add support for several computed values Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:53:03 -07:00
Jason Ekstrand	39103145ff	glsl/shader_enums: Add the other two compute builtins These weren't added before because they are actually calculated values that are computed from other inputs. However, in order to handle them in nir_lower_system_values, it's nice for them to have a canonical location. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:53:00 -07:00
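The commit above does not name the two builtins. Assuming they are GLSL's `gl_LocalInvocationIndex` and `gl_GlobalInvocationID`, the sketch below shows how each is derived from other compute inputs, following the formulas in the GLSL compute shader specification. The Python function names are illustrative only.

```python
def local_invocation_index(local_id, workgroup_size):
    """Flatten a 3D local invocation ID into a single index."""
    x, y, z = local_id
    size_x, size_y, _ = workgroup_size
    return z * size_x * size_y + y * size_x + x


def global_invocation_id(workgroup_id, workgroup_size, local_id):
    """Per-component: workgroup_id * workgroup_size + local_id."""
    return tuple(w * s + l
                 for w, s, l in zip(workgroup_id, workgroup_size, local_id))


print(local_invocation_index((1, 2, 3), (4, 5, 6)))
print(global_invocation_id((2, 0, 1), (4, 5, 6), (1, 2, 3)))
```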
Jason Ekstrand	22836dbefa	glsl/shader_enums: Add an enum for Vulkan InstanceIndex In Vulkan, you have InstanceIndex which begins at the base instance value rather than the zero-based InstanceID of GL. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:52:51 -07:00
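The difference between the two values in the commit above is a single offset. A minimal sketch, with a hypothetical function name:

```python
def instance_index(instance_id, base_instance):
    """Vulkan-style InstanceIndex: starts at the base instance.

    GL's InstanceID is zero-based, so the two differ by base_instance.
    """
    return base_instance + instance_id


# Drawing 3 instances with a base instance of 10:
print([instance_index(i, 10) for i in range(3)])
```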
Emil Velikov	581c8016f8	mesa: add missing header to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	5e010a72c9	drivers/softpipe: add missing header to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	c69ab885d7	mesa: automake: update and reuse X86_SSE41_FILES list Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	28da0d6922	compiler: android: flesh out nir into separate makefile Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	8d51500b2d	compiler: automake: flesh out NIR into separate makefile. Analogous to previous commit - improved readability at the expense of an extra file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	9324afc0e9	compiler: automake: split out glsl into separate makefile Preserve the functionality while keeping the files smaller and more readable. v2: Do not include Makefile.sources from the GLSL makefile (silences automake warnings) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Emil Velikov	3d67780b80	compiler: remove {glsl,nir}/Makefile.sources No longer used as of last commit. v2: Rebase. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Emil Velikov	c481c8f7f1	configure.ac: update the path of the generated files ... in order to determine if we need bison/flex. Failing to locate the files will lead to mandating bison/flex even when building from a release tarball. CC: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	4db8f15a25	glsl: move the android build scripts a level up Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	abf7088eb7	glsl: move the scons build script a level up It will allow us to remove the duplicate glsl/Makefile.sources. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	594e868555	Part revert "gallium/auxiliary: don't build NIR sources with MSVC2008 flags" This reverts commit `41c7912d04` but leaves out the pragma [that inspired the original commit]. Building mesa requires MSVC2013 or later, thus we no longer need this. v2: Use correct include path (src/glsl/nir -> src/compiler/nir) Conflicts: src/gallium/auxiliary/Makefile.am Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Nicolai Hähnle	590a37dc05	GL3: ARB_shader_image_load_store/size is done for radeonsi also in GLES Trivial.	2016-04-11 12:48:10 -05:00
Brian Paul	05aec42d3d	docs: fix Coverity URL	2016-04-11 09:10:39 -06:00
Oded Gabbay	d97f5d60f5	tgsi/doc: fix spelling error Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 11:43:43 +03:00
Jason Ekstrand	3aa1a5ee88	nir/lower_system_values: Simplify the computation of LocalInvocationIndex	2016-04-10 23:43:38 -07:00
Connor Abbott	a89c474157	nir: add a pass for lowering (un)pack_double_2x32 v2: Undo unintended change to the signature of nir_normalize_cubemap_coords (Iago). v3: Move to compiler/nir (Iago) v4: Remove Authors from copyright header (Michael Schellenberger) v5 (Sam): - Use nir_channel() and nir_ssa_for_alu_src() helpers (Jason) - Inline lower_double_pack_instr() code into lower_double_pack_block() (Jason). - Initialize nir_builder at lower_double_pack_impl() (Jason). Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	663e6421df	nir: add split versions of (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	b093808d26	nir: don't try to scalarize unpack_double_2x32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	9e31e0a21b	nir: add support for (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
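For readers unfamiliar with the `(un)pack_double_2x32` operations in the commits above, the sketch below shows the bit layout, assuming NIR follows GLSL's `packDouble2x32`, where the first component holds the 32 least significant bits. The 64-bit value is treated as an unsigned integer, in line with the "uint64 instead of float64" note; `bits_to_double` is a hypothetical helper added only to make the result visible.

```python
import struct

MASK32 = 0xFFFFFFFF


def pack_double_2x32(lo, hi):
    """Combine two 32-bit words into one 64-bit pattern (lo = low bits)."""
    return (lo & MASK32) | ((hi & MASK32) << 32)


def unpack_double_2x32(bits):
    """Split a 64-bit pattern into its (lo, hi) 32-bit words."""
    return bits & MASK32, (bits >> 32) & MASK32


def bits_to_double(bits):
    """Reinterpret a 64-bit pattern as an IEEE 754 double (illustration only)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# The double 1.0 has the bit pattern 0x3FF0000000000000.
print(bits_to_double(pack_double_2x32(0x00000000, 0x3FF00000)))
print(unpack_double_2x32(0x3FF0000000000000))
```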
Iago Toral Quiroga	d5d6260329	nir: add i2d and u2d opcodes v2: - Assert supports_int and don't fallback to nir_fmov (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	b16d06252e	nir: add d2i, d2u, d2b opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	a4bce07dc6	nir: add support for d2f and f2d Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	fab5d4cd95	nir/glsl_to_nir: set bit_size on ssbo_load result v2 (Sam): - Add missing bit_size assignment when ssbo_load destination is a boolean. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Samuel Iglesias Gonsálvez	a741378cb5	nir/glsl_to_nir: add bit-size info to add_instr() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:28:01 +02:00
Connor Abbott	4b37c64f3b	nir/split_var_copies: handle doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	106a1b5501	nir/instr_set: handle 64-bit bit-sizes v2: Revert spurious change in nir_opt_cse.c (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f2ccb63be1	nir: handle doubles in nir_deref_get_const_initializer_load() v2 (Sam): - Use proper bitsize value when calling to nir_load_const_instr_create() (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	41c2541fc7	nir/print: add support for printing doubles and bitsize v2: - Squash the printing doubles related patches into one patch (Sam). v3: - Print using PRIx64 format: long is 32-bit on some 32-bit platforms but long long is basically always 64-bit (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f5551f8a8b	nir/glsl_to_nir: support doubles v2: - Don't set sized types to the destination of texture related opcodes. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	8e69782e3e	nir/lower_load_const_to_scalar: support doubles and multiple bit sizes v2 (Sam): - Add assert to detect bit sizes other than 32 and 64 (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	12f628adcb	nir/lower_to_source_mods: Handle different bit sizes v2 (Sam): - Use helper to get base type from nir_alu_type. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	3663a2397e	nir: add bit_size info to nir_load_const_instr_create() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	a5b17ae745	nir/lower_vec: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	e3edaec739	nir: add bit_size info to nir_ssa_undef_instr_create() v2: - Make the users give the right bit_sizes as arguments (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	41a39e3384	nir/locals_to_regs: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	40d1b671a9	nir/from_ssa: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Timothy Arceri	4979cec820	i965: fix struct type in comment Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-11 14:03:09 +10:00
Jason Ekstrand	7d58cfa366	nir: Add a pass for gathering various bits of shader info Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-10 20:43:47 -07:00
Ilia Mirkin	875543e270	i965: enable OES_texture_buffer on gen7+ It will only end up getting exposed on gen8+ since it requires GL ES 3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is completed there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-10 20:24:26 -07:00
Dave Airlie	6f5f818b6d	docs: add some missing softpipe entries. I just forgot these when I added this stuff. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-11 13:14:48 +10:00
Kenneth Graunke	26c56e24e7	glsl: Don't remove XFB-only varyings. Consider the case of linking a program with both a vertex and fragment shader. The VS may compute output varyings that are intended for transform feedback, and not read by the fragment shader. In this case, var->data.is_unmatched_generic_inout will be true, but we still cannot eliminate the varyings. We need to also check !var->data.is_xfb_only. Fixes failures in ES31-CTS.gpu_shader5.fma_precision_*, which happen to use transform feedback in a way we apparently hadn't seen before. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-10 19:03:06 -07:00
Kenneth Graunke	ce84a92df5	i965/disasm: Decode per-slot offsets. We just never bothered to decode this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:32 -07:00
Kenneth Graunke	20c8f36508	i965/disasm: Decode "channel mask present" bit correctly. Bit 15 means "interleave" for most messages, but for SIMD8 messages it means "use channel masks". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:20 -07:00
Kenneth Graunke	b790232524	i965/disasm: Simplify the URB opcode printing with ?:. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:11 -07:00
Ilia Mirkin	9b5bd20eb2	glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310 The GLSL 4.20 and ESSL 3.00 specs don't list 'buffer' as a reserved keyword. Make the parser ignore it unless GLSL 4.30 / ESSL 3.10 are used, or ARB_shader_storage_buffer_objects is enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-04-09 20:41:54 -04:00
Jason Ekstrand	bff7a8c4f3	anv/pipeline: Set up flat enables correctly	2016-04-09 17:06:59 -07:00
Jason Ekstrand	1275c7c744	genxml: Fix the name of a 3DSTATE_SF/SBE field on gen6-7.5	2016-04-09 17:02:21 -07:00
Jason Ekstrand	aa6f9a4e1e	genxml: Break output detail of 3DSTATE_SF on gen7 into a struct This makes it work like 3DSTATE_SBE[_SWIZ] on gen7+	2016-04-09 17:00:22 -07:00
Jason Ekstrand	ddae342618	genxml: Fix up MOCS in RENDER_SURFACE_STATE on gen6 to match gen7	2016-04-09 16:59:04 -07:00
Ilia Mirkin	cdb6fa91fa	nvc0: handle the case where there are no framebuffer attachments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:44 -04:00
Ilia Mirkin	59ca92137b	nv50,nvc0: support sending string markers down into the command stream This should hopefully make it a little easier to debug with GL applications like glretrace and looking at command streams. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:43 -04:00
Ilia Mirkin	f9480d7918	nv50,nvc0: add invalidate_resource support for buffer resources Provide a callback to reallocate the underlying storage of a resource so that it is not bound to any existing fences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:43 -04:00
Eric Anholt	30b818d5eb	vc4: Move FRAG_X/Y/REV_FLAG to a QFILE like VPM or TLB color writes. This gives us one less set of special instruction generation cases, and instead just the case for returning the correct register to read.	2016-04-08 18:41:46 -07:00
Eric Anholt	f029932cac	vc4: Allow TLB Z/color/stencil writes from any ALU operation in QIR. This lets us write the Z directly from the FTOI for computed Z, and may let us coalesce color writes in the future. No change in my shader-db, but clearly drops an instruction in piglit's early-z test.	2016-04-08 18:41:46 -07:00
Eric Anholt	44d7b8ad12	vc4: Add a helper function for the construction of qregs. The separate declaration of the struct is not helping clarity, and I was going to be writing a whole lot more of these in the upcoming patches.	2016-04-08 18:41:45 -07:00
Eric Anholt	114c8b38d3	vc4: Add missing scheduling dependency for MS color writes.	2016-04-08 18:41:45 -07:00
Eric Anholt	483c172989	vc4: Drop the multi_instruction distinction for QIR instructions. It wasn't correctly flagged everywhere, and QPU generation now handles the only remaining case that was paying attention to it. No change on shader-db.	2016-04-08 18:41:45 -07:00
Eric Anholt	a8b525f8c4	vc4: Handle SF on instructions that write r4. Normal SFU writes couldn't have SF because they were marked as multi_instruction, but tex_result and tlb_color_read weren't. This ended up not being a problem according to anything in shader-db, but it seems possible.	2016-04-08 18:41:45 -07:00
Eric Anholt	e46b48963a	vc4: Allow multi-instruction QIR nodes to get VPM optimization. There used to be multi-instruction operations that would use src[] twice, which is why we couldn't do some optimizations on them. This is no longer the case. total instructions in shared programs: 77973 -> 77969 (-0.01%) instructions in affected programs: 84 -> 80 (-4.76%) total estimated cycles in shared programs: 234165 -> 234157 (-0.00%) estimated cycles in affected programs: 92 -> 84 (-8.70%)	2016-04-08 18:41:45 -07:00
Eric Anholt	99a759a4a3	vc4: Switch to using NIR_PASS macros. This gets us better validation of our NIR transformations.	2016-04-08 18:41:45 -07:00
Eric Anholt	7030eadbed	vc4: Handle nir_intrinsic_load_user_clip_plane as a vec4. I liked having all my NIR be scalar, but nir_validate() complains that the intrinsic writes 4 components but the destination we set up was only 1 component. I could generate a new scalar variant, but it's a lot easier to just leave it as a vec4. This doesn't hurt codegen since we GC unused uniforms, and UCP dot products use all the components anyway.	2016-04-08 18:40:55 -07:00
Rhys Kidd	40e77741cf	vc4: Emit a warning and proceed for handling loops in NIR. We don't really support control flow yet, but it's a lot nicer to render something and warn on stderr than to crash. Fixes the following piglit tests: - shaders/complex-loop-analysis-bug - shaders/glsl-fs-discard-04 Converts the following piglit tests from crash to fail: - shaders/glsl-fs-continue-inside-do-while - shaders/glsl-fs-loop - shaders/glsl-fs-loop-continue - shaders/glsl-fs-loop-nested - shaders/glsl-texcoord-array - shaders/glsl-vs-continue-inside-do-while - shaders/glsl-vs-loop - shaders/glsl-vs-loop-continue - shaders/glsl-vs-loop-nested No piglit regressions. v2 (Eric): Add stronger stderr warning. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	2450b219e5	vc4: Add a stub for NIR->QIR of control flow function nodes We shouldn't have any NIR functions present since all GLSL functions get inlined, but this would be a more informative error if it does happen. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	e5997778bc	vc4: Add better debug of NIR->QIR control flow graph failure Ensure NIR control flow graph nodes that are unhandled in QIR are reported with sufficient verbosity to aid debugging. This improves piglit outputs, amongst other tools. There are no other remaining uses of assert(0) as a blunt tool within vc4. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	e529dd179f	vc4: Remove unused include from vc4_program.c Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Lars Hamre	e25c24c638	glsl: handle unsigned int wraparound in link_shaders() v2: change check_explicit_uniform_locations() to return an unsigned 0 (Timothy Arceri) We were storing the int result of check_explicit_uniform_locations() in num_explicit_uniform_locs as an unsigned int which caused it to be 4294967295 when a -1 was returned. This in turn would cause the following error during linking: error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304) Results from running piglit tests/all with this patch and when ARB_explicit_uniform_location disabled: changes: 178 fixes: 176 regressions: 2 The two regressions are for the following tests: glean@glsl1-matrix column check (1) glean@glsl1-matrix column check (2) which regress from FAIL to CRASH. The regressions are acceptable because the tests are currently failing due to the aforementioned linker error. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-09 11:06:04 +10:00
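The wraparound described in the commit above can be reproduced in a few lines of C. This is a minimal sketch: `check_locations_old`, `stored_location_count` and `exceeds_limit` are hypothetical names, and the limit value is the one quoted in the error message. The result 4294967295 assumes a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <limits.h>

/* Limit quoted in the linker error message above. */
#define MAX_UNIFORM_LOCATIONS 98304u

/* Hypothetical stand-in for the old behaviour: report failure as -1. */
static int
check_locations_old(void)
{
   return -1;
}

/* Storing the int result in an unsigned variable converts -1 to UINT_MAX. */
static unsigned
stored_location_count(void)
{
   unsigned num_explicit_uniform_locs = check_locations_old();
   return num_explicit_uniform_locs;
}

/* The limit check then fails even though no locations were counted. */
static int
exceeds_limit(unsigned count)
{
   return count > MAX_UNIFORM_LOCATIONS;
}
```

Returning an unsigned 0 from the check, as the v2 note describes, avoids the conversion entirely.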
Jason Ekstrand	d4a28ae52a	anv/meta: Make clflushes conditional on !devinfo->has_llc	2016-04-08 17:07:49 -07:00
Jason Ekstrand	c226e72a39	anv/formats: Advertise blit support for stencil Thanks to advances in the blit code, we can do this now. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:59:29 -07:00
Jason Ekstrand	e3312644cb	anv/blit2d: Add support for W-tiled destinations Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 15:59:26 -07:00
Jason Ekstrand	0a6842c1bd	isl/surface_state: Set the correct pitch for W-tiled surfaces Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:52 -07:00
Jason Ekstrand	2e827816fa	anv/blit2d: Add another passthrough varying to the VS We need the VS to provide some setup data for other stages. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:49 -07:00
Jason Ekstrand	b377c1d08e	anv/image: Remove the offset parameter from image_view_init The only place we were using this was in meta_blit2d which always creates a new image anyway so we can just use the image offset. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:45 -07:00
Jason Ekstrand	f9a2570a06	anv/blit2d: Add a bind_dst helper function Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:42 -07:00
Jason Ekstrand	15a9468d85	anv/blit2d: Simplify create_iview Now it just creates the image and view. The caller is responsible for handling the offset calculations. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:40 -07:00
Jason Ekstrand	b8f3909b73	nir/gather_info: Handle discard_if Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:36 -07:00
Jason Ekstrand	819d0e1a7c	anv/meta2d: Add support for blitting from W-tiled sources on gen7 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 15:58:03 -07:00
Jason Ekstrand	b0a5ca5cfc	isl: Remove surf_get_intratile_offset_el The intratile offset may not be a multiple of the element size so this calculation is invalid. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:01 -07:00
Jason Ekstrand	b37502b983	isl: Rework the get_intratile_offset function The old function tried to work in elements which isn't, strictly speaking, a valid thing to do. In the case of a non-power-of-two format, there is no guarantee that the x offset into the tile is a multiple of the format block size. This commit refactors it to work entirely in terms of a tiling (not a surface) and bytes/rows. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:58 -07:00
Jason Ekstrand	4caba94086	anv/image: Expose the guts of CreateBufferView for meta Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:55 -07:00
Jason Ekstrand	4ee80e8816	anv/blit2d: Refactor in preparation for different src/dst types Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:52 -07:00
Jason Ekstrand	85b9a007ac	anv/blit2d: Add layouts for using a texel buffer source Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:49 -07:00
Jason Ekstrand	28eb02e345	anv/blit2d: Rename the descriptor set and pipeline layouts Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:47 -07:00
Jason Ekstrand	00e70868ee	anv/blit2d: Enhance teardown and clean up init error paths Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:45 -07:00
Jason Ekstrand	43fbdd7156	anv/blit2d: Factor binding the source image into a helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:43 -07:00
Jason Ekstrand	5187ab05b8	anv/blit2d: Inline meta_emit_blit2d Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:41 -07:00
Jason Ekstrand	b0a6cfb9b4	anv/blit2d: Pass the source pitch into the shader Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:39 -07:00
Jason Ekstrand	e466164c87	anv/blit2d: Break the texelfetch portion of shader building into a helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:37 -07:00
Jason Ekstrand	afada45590	anv/blit2d: Fix whitespace Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:35 -07:00
Jason Ekstrand	9553fd2c97	anv/blit2d: Fix a NIR writemask Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:32 -07:00
Jason Ekstrand	b38a0d64ba	anv/meta2d: Don't declare an array sampler in the fragment shader With the new blit framework we aren't using array textures and, from talking with Nanley, we don't think it's going to be useful in the future either. Just get rid of it for now. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:28 -07:00
Jason Ekstrand	dd6f720046	anv/blit2d: Remove the tex_dim parameter from copy_fragment_shader Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:56:52 -07:00
Jason Ekstrand	6cc7aec5b0	i965/tiled_memcopy: Get rid of the direction parameter to get_memcpy Now that we can use the much simpler rgba8_copy function, we don't need to hand different functions out based on direction. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:09:20 -07:00
Jason Ekstrand	d2b32656e1	i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functions This splits the two copy functions into three: One for unaligned copies, one for aligned sources, and one for aligned destinations. Thanks to the previous commit, we are now guaranteed that the aligned ones will only operate on aligned memory so they should be safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:09:15 -07:00
Jason Ekstrand	f6f54a29ca	i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions Each of the [de]tiling functions has three mem_copy calls: 1) Left edge to tile boundary 2) Tile boundary to tile boundary in a loop 3) Tile boundary to right edge Copies 2 and 3 start at a tile edge so the pointer to tiled memory is guaranteed to be at least 16-byte aligned. Copy 1, on the other hand, starts at some arbitrary place in the tile so it doesn't have any such alignment guarantees. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:08:51 -07:00
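The three-way split described in the commit above can be sketched in C. This is a simplified illustration over linear buffers, not the driver's tiling code: `copy_row_span`, `copy_unaligned`, `copy_aligned16` and the byte counters are hypothetical, and the tile width is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte counters so the split can be observed (illustration only). */
static size_t unaligned_bytes, aligned_bytes;

/* Hypothetical copy used when the span starts mid-tile. */
static void
copy_unaligned(uint8_t *dst, const uint8_t *src, size_t bytes)
{
   memcpy(dst, src, bytes);
   unaligned_bytes += bytes;
}

/* Hypothetical copy used when the span starts on a tile boundary. */
static void
copy_aligned16(uint8_t *dst, const uint8_t *src, size_t bytes)
{
   memcpy(dst, src, bytes);
   aligned_bytes += bytes;
}

/* Copy bytes [x0, x3) of one row, split at tile boundaries.
 * tile_w is the tile width in bytes and must be a power of two. */
static void
copy_row_span(uint8_t *dst, const uint8_t *src,
              size_t x0, size_t x3, size_t tile_w)
{
   size_t x1 = (x0 + tile_w - 1) & ~(tile_w - 1); /* first boundary >= x0 */
   size_t x2 = x3 & ~(tile_w - 1);                /* last boundary <= x3 */

   if (x1 > x2)
      x1 = x2 = x3; /* the whole span lies inside one tile */

   /* 1) Left edge to tile boundary: no alignment guarantee. */
   if (x1 > x0)
      copy_unaligned(dst + x0, src + x0, x1 - x0);

   /* 2) Tile boundary to tile boundary. */
   for (size_t x = x1; x < x2; x += tile_w)
      copy_aligned16(dst + x, src + x, tile_w);

   /* 3) Tile boundary to right edge. */
   if (x3 > x2)
      copy_aligned16(dst + x2, src + x2, x3 - x2);
}
```

With a 16-byte tile width, copying bytes 5 to 40 sends 11 bytes through the unaligned path and 24 bytes through the aligned one.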
Ben Widawsky	e5295b5fb4	i965: Check eu/subslices are > 0 Now that the check is restricted to gen8+, we should always get back a non-zero positive value for the EU and subslice counts. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:52:29 -07:00
Ben Widawsky	cc01b63d73	i965: Fix eu/subslice warning Older gen platforms do not actually return a value for subslice and EU total; instead (IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until we have the actual GPU generation to avoid useless warnings when running on older platforms with older kernels. Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:52:29 -07:00
Ben Widawsky	4213b00e30	i965: Extract SSEU configuration info Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:51:01 -07:00
Brian Paul	4420f189b6	st/mesa: fix glReadBuffer() assertion failure If the first call in a GL app is glReadPixels(GL_FRONT) we'd fail the assert(st->ctx->FragmentProgram._Current) at st_atom_shader.c:114 in update_fp(). This is because we were calling st_validate_state() without first updating Mesa state with _mesa_update_state(). The regression came from commit `83b589301f` "st/mesa: fix frontbuffer glReadPixels regressions". The new piglit gl-1.0-simple-readbuffer test exercises this. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-08 09:49:05 -06:00
Thomas Hindoe Paaboel Andersen	b9855dcdf7	st/va: avoid dereference after free in vlVaDestroyImage Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-04-08 06:57:17 +01:00
Jason Ekstrand	e26a978773	Merge remote-tracking branch 'public/master' into vulkan	2016-04-07 16:56:34 -07:00
Jason Ekstrand	15895bf777	i965/fs: Use the scale helper in surface_builder As requested by Curro	2016-04-07 16:49:09 -07:00
Marek Olšák	1cd19ebc4a	radeonsi: do per-pixel clipping based on viewport states In other words, vport scissors are derived from viewport states. If the scissor test is enabled, the intersection of both is used. The guard band will disable clipping, so we have to clip per-pixel. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-08 00:23:05 +02:00
Samuel Pitoiset	059308db84	nv50/ir: do not try to attach JOIN ops to ATOM This might result in an INVALID_OPCODE dmesg error in case a join is attached to an atomic operation. Spotted with arb_shader_image_load_store-host-mem-barrier on GK104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-04-07 23:10:26 +02:00
Nicolai Hähnle	2abe4f8d7d	radeonsi: raise number of samplers per shader to 32 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94835 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	9d2693f58a	radeonsi: expand the compressed color and depth texture masks to 64 bits This is in preparation of raising the number of exposed sampler views to 32, which will raise the total number of sampler views to 33 for the polygon stipple texture. That texture should never be compressed (and it's certainly not a depth texture), but this approach seems cleaner to me than special-casing the last slot in all affected code paths. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	f270067ef9	radeonsi: replace magic 16 by SI_NUM_USER_SAMPLERS Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	f09036f6c0	gallium: raise PIPE_MAX_SAMPLERS to 32 The previous value of 18 was motivated by having drivers that want to expose 16 samplers but also use some additional samplers for internal use. Raising the value even higher isn't going to hurt that case. On the other hand, some drivers actually use PIPE_MAX_SAMPLERS as the number of samplers they expose externally, so raising this number above 32 is fragile (because several places in the code use bitfields, and tracking down and widening all of them is prone to miss some case). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	84c4d069ac	st/glsl_to_tgsi: make samplers_used an uint32_t (v2) It is used as a bitfield, so it seems cleaner to keep it unsigned. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	4bfcc86bf9	tgsi/scan: add an assert for the size of the samplers_declared bitfield The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	cc39879989	draw/aaline: stronger guard against no free samplers (v2) Line anti-aliasing will fail when there is no free sampler available. Make the corresponding guard more robust in preparation of raising PIPE_MAX_SAMPLERS to 32. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	040f5cb09e	util/pstipple: stronger guard against no free samplers (v2) When hasFixedUnit is false, polygon stippling will fail when there is no free sampler available. Make the corresponding guard more robust in preparation of raising PIPE_MAX_SAMPLERS to 32. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:02 -05:00
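The `1u << idx` pattern and the bitfield-size assert mentioned in the commits above can be sketched in C. The `find_free_sampler` helper and the local `MAX_SAMPLERS` constant are hypothetical; they only mirror the guard described in the messages, assuming a 32-bit mask and a limit of 32 samplers.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the PIPE_MAX_SAMPLERS value of 32 named in the commits above. */
#define MAX_SAMPLERS 32

/* Hypothetical helper: return the lowest free sampler index,
 * or -1 if every sampler is already in use. */
static int
find_free_sampler(uint32_t samplers_used)
{
   /* The bitfield needs one bit per sampler. */
   assert(MAX_SAMPLERS <= sizeof(samplers_used) * 8);

   for (int idx = 0; idx < MAX_SAMPLERS; idx++) {
      /* 1u keeps the shift unsigned; 1 << 31 would shift into the
       * sign bit of a signed int, which is undefined in C. */
      if (!(samplers_used & (1u << idx)))
         return idx;
   }
   return -1;
}
```

Returning -1 when the mask is full gives callers an explicit "no free sampler" result to guard on.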
Brian Paul	b7e67b2337	svga: new SVGA_MSAA env var to disable/enable MSAA pixel formats On by default. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-07 11:42:43 -06:00
Brian Paul	9f443af449	svga: add some trivial null pointer checks These small mallocs will probably never fail, but static analysis tools may complain about the missing checks. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-07 11:42:43 -06:00
Samuel Pitoiset	60cf2fa477	trace: add missing set_shader_images() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 18:52:27 +02:00
Marek Olšák	5fac4887d8	radeonsi: disable perfect ZPASS counts for PIPE_QUERY_OCCLUSION_PREDICATE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 13:58:01 +02:00
Marek Olšák	baa0b3f4cc	radeonsi: don't use the real barrier instruction in tess ctrl shaders Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 13:58:01 +02:00
Michel Dänzer	715e97e342	Revert "clover: Fix build against clang SVN >= r265359" This reverts commit `0daab9878d`. The corresponding clang change was reverted. Trivial.	2016-04-07 17:03:09 +09:00
Jason Ekstrand	05db680248	nir/types: Add a wrapper for count_attribute_slots Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-07 09:44:11 +02:00
Kristian Høgsberg Kristensen	068935844c	genxml: Add GEN6 genxml Not used yet, but let's put it here for now.	2016-04-06 21:08:34 -07:00
Dave Airlie	828d84c8e2	r600: use radeon_emit in a few more places in evergreen_compute This is just a cleanup of the code. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:26 +01:00
Dave Airlie	0c40b6f96c	r600: make compute global buffer functions static. This moves things around so that the global buffer handling functions in evergreen_compute.c are static. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:22 +01:00
Dave Airlie	a5d247dda0	r600: make two compute functions static. These aren't used outside evergreen_compute.c Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:17 +01:00
Dave Airlie	41558efa87	r600: using pipe_grid_info more in evergreen_compute. No reason to pull the pieces apart here, also make one of the functions static as it's unused outside this. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:13 +01:00
Dave Airlie	a6e17d7d69	r600: in evergreen_compute use ctx consistently instead of ctx_ Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:09 +01:00
Dave Airlie	aeb2be3a2f	r600: use rctx consistently in evergreen_compute.c Another step towards cleaning this up. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:05 +01:00
Dave Airlie	0560c82ff6	r600: cleanup whitespace in evergreen_compute.c This aligns the code with the style of the rest of the driver. Makes editing it a lot less painful. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:38:51 +01:00
Edward O'Callaghan	6fc3e7c988	GL3.txt: Mark ARB_framebuffer_no_attachments as done Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	ea310f2b38	r600g: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	483a686f80	radeonsi: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	1156cad405	radeonsi: Improve assert info out of si_set_framebuffer_state() Let's give the developer a little hand if we are going to assert on a zero literal at the end of a branch. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	bb1bd0ddd7	radeonsi: Allow 16 samples MSAA mode for PIPE_FORMAT_NONE For ARB_framebuffer_no_attachment; A is_format_supported() query with 'PIPE_FORMAT_NONE' passed implies a query of the number of samples supported from the framebuffer with no attachment. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	63f2b2f2c0	softpipe: Set samples and layers in set_framebuffer_state() cb Carries across the number of samples and layers state in the 'softpipe_set_framebuffer_state()' callback. This state is part of 'ARB_framebuffer_no_attachments' support. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	c6a514d7df	mesa/st: Update framebuffer state with no.of samples,layers Handle the case of ARB_framebuffer_no_attachment. Also, kill off a dead debug printf() call while we are here. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	7ff28d2af0	gallium/trace: Dump no.of samples and layers in fb state Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	0b7075fed7	gallium: Put no.of {samples,layers} into pipe_framebuffer_state Here we store the number of samples and layers directly in the pipe_framebuffer_state so that in the case of ARB_framebuffer_no_attachment we may make use of them directly. Further, we adjust various gallium/auxiliary helper functions accordingly. V2: Convert branches in util_framebuffer_get_num_layers() and util_framebuffer_get_num_samples() to their canonical form. V3: 'git stash pop' the typo fix of 'cbufs' which should be 'nr_cbufs' that was missing in V2, whoops! Thanks Marek for pointing this out yet again. V4: Squash in the following patch: 'gallium/util: Ensure util_framebuffer_get_num_samples() is valid' Upon context creation, internal driver structures are malloc()'ed and memset() to zero them. This results in an invalid number of samples 'by default'. Handle this in the simplest way to avoid elaborate and probably equally sub-optimal solutions. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	b512b5fd36	mesa/st: Set _NumSamples in update_framebuffer_state() Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported with a framebuffer using no attachment. V.2: Rewrite MSAA mode loop to be more general. V.3: Move comment to right place after loop was rewritten. V.4: [airlied] remove unneeded variable, and assert, and unneeded pipe assignment Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-07 12:02:06 +10:00
Edward O'Callaghan	2016e9ffda	gallium: Obtain ARB_framebuffer_no_attachment constants Set default values for the constants required in ARB_framebuffer_no_attachments and obtained the number of layers from ``PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS``. We also obtain the MaxFramebufferSamples value using a query back to the driver for PIPE_FORMAT_NONE. V.1: Merge if branch predicates into one branch. Move const init into st_init_limits() [airlied: whitespace fixup] Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:44 +10:00
Edward O'Callaghan	4bc9130fba	gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:44 +10:00
Edward O'Callaghan	85f79f0c75	mesa/st: Use _mesa_geometric_ functions appropriately Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometric_ convenience functions for those places where the geometry of the gl_framebuffer is needed. This is in contrast to the geometry of the intersection of the attachments of the gl_framebuffer. This patch paves the way to enable GL_ARB_framebuffer_no_attachments for all gallium drivers. V.2: Remove intermediate variable state. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:35 +10:00
Edward O'Callaghan	b40375a21c	mesa: Add comment to framebuffer_parameteri() V.2: Change 'N.B.,' to 'NOTE:'. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:55:33 +10:00
Jason Ekstrand	c62db279b6	i965/sf_state: Pull flat_enables out of prog_data Previously, we were walking over the shader source to figure out which inputs should be marked flat. Now, we can just pull it out of prog_data. This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it also means that it will get properly cached. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	e61cc87c75	i965/fs: Add a flat_inputs field to prog_data Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	5c5a9b7bf6	brw/device_info: Add a helper for getting a device name This is needed by the Vulkan driver Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	a241ab43b5	i965/fs_surface_builder: Mask signed integers after conversion Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	3921b64e63	i965/fs: Make the repclear shader support either a uniform or a flat input In the Vulkan driver we use a single flat input instead of a uniform because setting up push constants is more disruptive to the pipeline than setting up another vertex input. This uses the number of uniforms as a key to keep it working for the GL driver. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:50 -07:00
Jason Ekstrand	061969f9dd	i965: Move get_hw_prim_for_gl_prim to brw_util.c It's used by brw_compile_gs in brw_vec4_gs_visitor.cpp so it needs to be in a file that's linked into libi965_compiler.la. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:47 -07:00
Bas Nieuwenhuizen	3393358115	radeonsi: set shader calling conventions Note that old mesa + new LLVM or new mesa + old LLVM breaks with this change and the corresponding LLVM change (D18559). For LLVM version <= 3.8 we use the old method, but we can't detect people using a post 3.8 svn version that is still too old. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-06 21:54:35 +02:00
Marek Olšák	0293d72fa5	drirc: add a workaround for blackness in Warsow Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>	2016-04-06 12:53:40 +02:00
Ilia Mirkin	2e123e1a25	glsl: use has_shader_storage_buffer_objects helper Replaces open-coded logic with existing helper. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-05 20:27:32 -04:00
Timothy Arceri	5d39f03806	glsl: remove remaining tabs in link_uniform_blocks.cpp Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:33 +10:00
Timothy Arceri	7ef57aa685	mesa: remove unused IsShaderStorage field Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:28 +10:00
Timothy Arceri	f1293b2f9b	glsl: fully split apart buffer block arrays With this change we create the UBO and SSBO arrays separately from the beginning rather than putting them into a combined array and splitting it apart later. A bug with UBO and SSBO stage reference querying is also fixed as we now use the block index to lookup the references in the separate arrays not the combined buffer block array. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:24 +10:00
Rob Clark	506b561ba7	freedreno/ir3: insert extra move into phi We had an implicit assumption that the phi src was assigned in its source (pred) block leading into the phi. But this is not true with NIR, so we can't just ignore the source block specified in the nir_phi_src. Insert an extra mov in the source block. If it is not required the CP pass will take it back out again. Fixes: ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test and probably others. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-05 15:04:43 -04:00
Rob Clark	f9cdbf4405	freedreno/ir3: eliminate unnecessary absneg's The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like: cmps.s.ge r1.x, ... absneg.s r1.x, (neg)r1.x absneg.s r1.x, (abs)r1.x sel.b32 r2.x, r0.x, r1.x, r0.y The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-05 15:04:25 -04:00
Michel Dänzer	0daab9878d	clover: Fix build against clang SVN >= r265359 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-05 17:00:58 +00:00
Bas Nieuwenhuizen	799789ba99	radeonsi: use bounded indexing for samplers Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-05 19:19:18 +02:00
Bas Nieuwenhuizen	713353db18	radeonsi: use bounded indexing for constant buffers Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-05 19:19:07 +02:00
Marek Olšák	a64dbdf612	gallium/radeon: allow multiple exports of the same texture with different usage Instead of failing an assertion, disable DCC and CMASK on the first export that needs it, and merge the external usage flags. v2: clear the EXPLICIT_FLUSH flag if it's not set; whitespace fixes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-05 15:32:40 +02:00
Marek Olšák	25f96d2b97	docs/relnotes: document EGL_KHR_reusable_sync	2016-04-05 15:32:40 +02:00
Dongwon Kim	70299474f5	egl: add EGL_KHR_reusable_sync to egl_dri This patch enables an EGL extension, EGL_KHR_reusable_sync. This new extension basically provides a way for multiple APIs or threads to be executed synchronously via a "reusable sync" primitive shared by those threads/API calls. This was implemented based on the specification at https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt v2 - use thread functions defined in C11/threads.h instead of using direct pthread calls - make the timeout set with reference to CLOCK_MONOTONIC - cleaned up the way expiration time is calculated - (bug fix) in dri2_client_wait_sync, case EGL_SYNC_CL_EVENT_KHR has been added. - (bug fix) in dri2_destroy_sync, return from cond_broadcast call is now stored in 'err' instead of 'ret' to prevent 'ret' from being reset to 'EGL_FALSE' even in successful case - corrected minor syntax problems v3 - dri2_egl_unref_sync now became 'void' type. No more error check is needed for this function call as a result. - (bug fix) resolved issue with duplicated unlocking of display in eglClientWaitSync when type of sync is "EGL_KHR_REUSABLE_SYNC" Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-05 15:24:57 +02:00
Rob Clark	3e13572826	freedreno/ir3: deal with duplicate phi sources Otherwise we end up with funny things like: mov.f32f32 r0.x, r1.y mov.f32f32 r0.x, r1.y (It doesn't happen as much after fixing the problem w/ CP into phi src, but it can still happen since we aren't too clever about generating phi sources in the first place.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	f8feb97ba5	freedreno/ir3: fix silly brain-fart in RA We want to consider all the vars, not 1/32nd of them, when extending live-ranges. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	8e451c2d06	freedreno/ir3: don't cp into phi's The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	383b6e87f9	freedreno/ir3: we can't store immediate values Fixes some transform-feedback piglits, like: bin/ext_transform_feedback-nonflat-integral Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	d47fb856af	freedreno/ir3: add dumping for use/def/live-in/live-out Turned out to be useful to debug an issue in RA. Let's keep it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	38ae05a340	freedreno/ir3: drop unused instr category arg No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	19739e4fb9	freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	70735643f4	freedreno/ir3: encode instruction category in opc_t Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Jason Ekstrand	5ea3647f89	i965/fs: Move the code for load/store_shared to emit_cs_intrinsic They are compute-shader only and that's where the code for doing atomics on shared variables lives so it seems to make sense. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-04 15:56:50 -07:00
Jason Ekstrand	80c72a8ea7	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-04 15:56:39 -07:00
Jason Ekstrand	e5c833db5a	i965/compiler: Remove a redundant declaration of brw_compiler_create	2016-04-04 14:51:35 -07:00
Kenneth Graunke	3babb7b0a4	nir: Use PRIi64 and PRIu64 instead of %ld and %lu. %ld and %lu aren't the right format specifiers for int64_t and uint64_t on 32-bit (x86) systems. They're %zu on Linux and %Iu on Windows. Use the standard C99 macros in hopes that they work everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-04 14:38:48 -07:00
Kenneth Graunke	da5d08707b	i965: Fix invalid pointer read in dead_control_flow_eliminate(). There may not be a previous block. In this case, there's no real work to do, so just continue on to the next one. v2: Update for bblock->prev() API change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-04 14:34:40 -07:00
Kenneth Graunke	9486614938	i965: Make bblock_t::next and friends return NULL at sentinels. The bblock_t::prev/prev_const/next/next_const API returns bblock_t pointers, rather than exec_nodes. So it's a bit surprising. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-04 14:34:16 -07:00
Kenneth Graunke	5509d43a11	glsl: Lower variable indexing of system value arrays unconditionally. lower_variable_index_to_cond_assign() did not handle system values. gl_SampleMaskIn[] is a system value, and also an array. Accessing it with a variable index would trigger an unreachable() assert. Rather than adding a new EmitNoIndirectSystemValues flag, we simply lower unconditionally. There is exactly one case where this occurs, and for all current drivers, lowering produces optimal code. Even for future drivers with 32x MSAA, it produces reasonable code. Fixes Piglit's new samplemaskin-indirect test. Also fixes many ES31-CTS tests when OES_sample_variables is enabled. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 14:29:21 -07:00
Jason Ekstrand	db35a851ad	i965/defines: Unconditionally define primitives	2016-04-04 14:25:36 -07:00
Jason Ekstrand	6a04968784	Merge remote-tracking branch 'public/master' into vulkan	2016-04-04 13:58:05 -07:00
Jason Ekstrand	88ef2476dc	i965/peephole_ffma: Only match a mul+add if none of the ops are exact Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 13:48:10 -07:00
Jason Ekstrand	eb93d6dec8	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 13:48:10 -07:00
Jason Ekstrand	fe247bbe92	nir: Stop double-printing function arguments	2016-04-04 12:10:20 -07:00
Jason Ekstrand	cb317b8d07	glsl: Stop force-enabling compute shaders This isn't needed since we no longer use the GLSL compiler in Vulkan.	2016-04-04 12:09:12 -07:00
Jason Ekstrand	4d040a4ad3	glsl/standalone: Get rid of the unneeded _mesa_error_no_memory stub This hasn't been needed since we stopped using the GLSL compiler in the Vulkan driver and it was tripping up scons. Removing it fixes the scons build.	2016-04-04 12:07:51 -07:00
Kenneth Graunke	65fbc43d54	i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-04 11:35:16 -07:00
Jason Ekstrand	8c8157bf6f	Remove more spirv2nir remnants	2016-04-04 11:24:48 -07:00
Kenneth Graunke	3aa51e02d6	i965: Allow 8x MSAA on >= 64bpp formats on Gen8+. See commit `3b0279a69` - this restriction is documented in the "Surface Format" field of RENDER_SURFACE_STATE. Looking at newer documentation, this restriction appears to exist on Haswell, but no longer applies on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-04 10:41:29 -07:00
Brian Paul	1eeec7ec41	docs: remove stray 'TBD' in 11.2.0 relnotes file	2016-04-04 10:33:11 -06:00
Emil Velikov	35132c413c	docs: add news item and link release notes for 11.2.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-04 12:57:56 +01:00
Emil Velikov	dc4923d41f	docs: add sha256 checksums for 11.2.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e7fb889dcc`)	2016-04-04 12:55:55 +01:00
Emil Velikov	7dc11ed0b2	docs: Update 11.2.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ff9ddb9eb1`)	2016-04-04 12:55:54 +01:00
Dave Airlie	f9b8b48bed	mesa/get: fix MAX_GEOMETRY_SHADER_STORAGE_BLOCKS this was returning the fragment shader value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-04 10:52:25 +01:00
Ilia Mirkin	4bc3b1ca48	nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-04 00:32:48 -04:00
Ilia Mirkin	dab40d8083	docs: add note about GL_EXT_base_instance, sort entries Trivial. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-03 21:18:17 -04:00
Ilia Mirkin	d76e1cd2dd	mesa: expose EXT_base_instance in ES3 contexts This extension is identical to ARB_base_instance. Reuse the same entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 20:40:55 -04:00
Ilia Mirkin	807e2c27ac	mesa: expose EXT_polygon_offset_clamp in ES contexts The extension spec was extended to also support ES. This functionality is provided all the way back to ES 1.0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 20:40:55 -04:00
Kenneth Graunke	40628886ca	glsl: Print "precise" on ir_variable nodes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-03 17:33:38 -07:00
Jose Fonseca	7ad49daca6	gallivm: Introduce lp_format_intrinsic. For adding .v4f32 like suffixes to intrinsics, taking special care for scalar case, which was being often neglected. This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars.) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-04 00:06:09 +01:00
Ilia Mirkin	7af12a8dc6	glsl: make sampler2DMSArray available in ESSL 3.20 Also avoid double-adding the sampler2DMS types when the array ext is enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:52 -04:00
Ilia Mirkin	aebb0e0186	glsl: make ssbo predicate return true when in a GLSL 430 or ESSL 310 shader I can't tell whether this actually matters, but we're creating function signatures with this predicate, so it should probably match when SSBO's are available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:49 -04:00
Ilia Mirkin	87906cbc37	glsl: allow conservative depth qualifiers in GLSL 420 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:35 -04:00
Ilia Mirkin	d50ffb5e46	mesa: add always-false-for-now enables for GL 4.3, 4.4, 4.5. As the relevant extensions get implemented, the lines should be uncommented. I believe this is (almost) everything needed for those GL versions though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Ilia Mirkin	9abbc49712	glsl: add ARB_ES3_1_compatibility support Oddly a bunch of the features it adds are actually from ESSL 3.20. But the spec is quite clear, oh well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Ilia Mirkin	1708e24f65	mesa: add ES3_1_compatibility extension enable Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Jose Fonseca	a293f57e13	gallivm: Use llvm.fabs. Exactly the same code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:09 +01:00
Jose Fonseca	e4f01da15d	gallivm: Prefer backend agnostic intrinsic for rounding. We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm. lp_test_arit. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:07 +01:00
Jose Fonseca	324451e73f	gallivm: Add debug option to force SSE2. For simulating less capable machines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:08:57 +01:00
Jose Fonseca	5fa31a4aba	llvmpipe: Test abs. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	522ebe701d	llvmpipe: Build lp_test_arit on MSVC too. It builds fine now. Probably due to C99 support. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	b284f1f7f9	gallivm: Fix performance regressions due to vector selects. LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that. Thanks to Roland Scheidegger for spotting and analyzing the issue. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	11c4e5b45c	gallivm: Remove lp_build_load_volatile. No longer needed. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	bcfb86b09d	gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Timothy Arceri	6d54096fa6	mesa: remove unrequired else The if always returns so no need for an else. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-03 09:55:19 +10:00
Ilia Mirkin	d64134ecae	gm107/ir: add OP_SELP emission, used in DSQRT lowering The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 19:27:51 -04:00
Ilia Mirkin	3610b1466d	nv50/ir: we can't load local memory directly into an output This fixes piglit tests like tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test and related ones. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-02 18:10:20 -04:00
Christian Schmidbauer	2a529a8ac8	st/nine: specify WINAPI only for i386 and amd64 Currently mesa fails building with the x32 abi as ms_abi is not defined in such a case. The patch uses ms_abi only for amd64 targets and stdcall only for i386 targets to be sure that those are defined. This patch additionally checks for __GNUC__ to guarantee that __attribute__ is available. CC: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Schmidbauer <ch.schmidbauer@gmail.com> Acked-by: Axel Davy <axel.davy@ens.fr>	2016-04-02 23:30:40 +02:00
Samuel Pitoiset	0852c5703b	nv50/ir: fix envyas variants when building the code lib nvc0 and nve4 have been respectively replaced by gf100 and gk104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 20:00:57 +02:00
Brian Paul	36d8fed798	svga: remove unused svga_compile_key::texture_msaa field Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	b283c76342	svga: check TXF instruction's target to determine MSAA Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation. Fixes VMware bug 1632739. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	ef10b5427a	tgsi: add simple tgsi_is_msaa_target() helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Timothy Arceri	070e5a7405	glsl: rename var and simplify if is_ubo_var is true for both UBOs and SSBOs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0fbd073dc2	glsl: store ubo or ssbo index in block index Previously we stored the buffer block index, i.e. the index of a combined ubo/ssbo list. Fixes several dEQP-GLES31.functional tests: - program_interface_query.uniform.block_index.block_array - program_interface_query.uniform.block_index.named_block - program_interface_query.uniform.block_index.unnamed_block - program_interface_query.uniform.random.10 - program_interface_query.uniform.random.15 - program_interface_query.uniform.random.22 - program_interface_query.uniform.random.24 - program_interface_query.uniform.random.26 - program_interface_query.uniform.random.28 - program_interface_query.uniform.random.3 - program_interface_query.uniform.random.31 - program_interface_query.uniform.random.38 - program_interface_query.uniform.random.5 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94116 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	1265e1c4e1	glsl: store stage reference in gl_uniform_block This allows us to simplify the code and drop InterfaceBlockStageIndex which is a per stage array of integers the size of all blocks in the program combined including duplicates across stages. Adding a stage ref per block will use less memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	d8855d66f4	glsl: simplify buffer block resource limit checking This changes the code to use the buffer counts stored for each stage rather than counting from scratch. It also moves the checks outside of the for loop which means we now just get a single link error message if we go over the max rather than X error messages where X is the number we have exceeded the max by. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0082b33a78	glsl: simplify SSBO resources check We already have a count of active SSBOs per stage so use it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	3e74bf5b9d	glsl: split buffer block arrays earlier This will allow us to use them when checking resources in a following patch and clean up a bunch of code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0163881528	glsl: only set buffer block binding once during initialisation Since `8683d54d2b` there is now a single instance of the buffer block information that needs to be updated rather than one instance for each stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Kenneth Graunke	94ed482c19	glsl: Fix program interface query locations biasing for SSO. With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to the first and last shader stage linked into a program. This may not be the vertex and fragment shader stages. So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus. We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs, FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases. Note that built-in variables get a location of -1. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var_explicit_location - program_input.location.separable_fragment.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	c123294dfe	glsl: Return -1 for program interface query locations in many cases. We were recording locations for all variables, even ones without an explicit location set. Implement the rules from the spec, and record -1 in the resource list accordingly. Make program_resource_location stop doing math on negative values. Remove hacks that are no longer necessary now that we've stopped doing that. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var - program_input.location.separable_fragment.var_array - program_output.location.separable_vertex.var_array - program_output.location.separable_vertex.var_array v2: Delete more code Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	9fe211bec4	glsl: Consolidate gl_VertexIDMESA -> gl_VertexID query hacks. A program will either have gl_VertexID or gl_VertexIDMESA (the lowered zero-based version), not both. Just spoof it in the resource list so the hacks are done in a single place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	013f25c3b3	glsl: Clean up some leftover cruft. stages is always 1 << stage now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	98c22c0403	glsl: Add all system variables to the input resource list. System values are just built-in input variables that we've opted to special-case out of convenience. We need to consider all inputs, regardless of how we've classified them. Unfortunately, there's one exception: we shouldn't add gl_BaseVertex unless ARB_shader_draw_parameters is enabled, because it doesn't actually exist in the language, and shouldn't be counted in the GL_ACTIVE_RESOURCES query. Fixes dEQP-GLES31.functional.program_interface_query.program_input. resource_list.compute.empty, which expects gl_NumWorkGroups to appear in the resource list. v2: Delete more code Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:18 -07:00
Kenneth Graunke	6e8b9d5bdd	glsl: Delete hack for VS system values. This makes no sense. If the stage being considered is the vertex shader, then we'll add inputs and system values appropriately. If we're not considering the vertex shader, then we absolutely should not do anything with it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	47daf17da0	glsl: Make add_interface_variables only consider the appropriate stage. add_interface_variables() is supposed to add variables for the inputs of the first shader stage linked into a program, and the outputs of the last shader stage linked into a program. From the ARB_program_interface_query specification: "* PROGRAM_INPUT corresponds to the set of active input variables used by the first shader stage of <program>. If <program> includes multiple shader stages, input variables from any shader stage other than the first will not be enumerated. * PROGRAM_OUTPUT corresponds to the set of active output variables (section 2.14.11) used by the last shader stage of <program>. If <program> includes multiple shader stages, output variables from any shader stage other than the last will not be enumerated." Previously, we used build_stageref here, which walks over all linked shaders in the program. This meant that internal varyings would be visible. We don't actually need any of build_stageref's code: we already explicitly skip packed varyings, handle modes, and the name comparisons just do a fuzzy string comparison of name with itself. Fixes two tests: dEQP-GLES31.functional.program_interface_query. program_{input,output}.referenced_by.referenced_by_vertex_fragment. These tests have a VS and FS linked together into a single program. Both stages have an input called "shaderInput". But the FS input should not be visible because it isn't the first stage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	998ef1ad71	glsl: Clarify "mask" variable in add_interface_variables(). This is a bitfield of which stages refer to a variable. It is not used to mask off bits. In fact, it's used to contribute additional bits. Rename it and tidy a bit of the logic. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	356c99b4e7	glsl: Pass stage to add_interface_variables(). add_interface_variables is supposed to add variables from either the first or last stage of a linked shader. But it has no way of knowing the stage it's being asked to process, which makes it impossible to produce correct stagerefs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	2c5afe1fa9	glsl: Make vertex ID lowering declare gl_BaseVertex as hidden. If the GL_ARB_shader_draw_parameters extension is enabled, we'll already have a gl_BaseVertex variable. It will have var->how_declared set to ir_var_declared_implicitly, and will appear in the program resource list. If not, we make one for internal use. We don't want it to be listed in the program resource list, as the application won't be expecting it. Marking it hidden will properly exclude it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 21:58:22 -07:00
Kenneth Graunke	33df1c2935	glsl: Exclude ir_var_hidden variables from the program resource list. We occasionally generate variables internally that we want to exclude from the program resource list, as applications won't be expecting them to be present. The next patch will make use of this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 21:56:43 -07:00
Kenneth Graunke	15cd3ebede	mesa: Make _mesa_choose_tex_format() handle stencil textures. This is necessary for ARB_texture_stencil8 support on classic drivers. Presumably Gallium works because it implements its own ChooseTexFormat. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-01 19:04:28 -07:00
Jordan Justen	ef1b397b07	glsl: Don't require matching centroid qualifiers Note: This patch appears to violate older OpenGL and OpenGLES specs. The OpenGLES GLSL 3.1 and OpenGL GLSL 4.3 specifications both remove the requirement for the output and input centroid qualifiers to match. The deqp dEQP-GLES3.functional.shaders.linkage.varying.rules.differing_interpolation_2 test wants the newer OpenGLES 3.1 specification behavior, even for OpenGLES 3.0. This patch simply removes the checking in all cases. The OpenGLES 3.0 conformance test suite doesn't appear to require the older ("must match") spec behavior. For reference, here are the relevant spec citations: The OpenGL 4.2 spec says: "the last active shader stage output variables and fragment shader input variables of the same name must match in type and qualification (other than out matching to in)" The OpenGL 4.3 spec says: "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ." The OpenGLES GLSL 3.00.4 specification says: "The output of the vertex shader and the input of the fragment shader form an interface. For this interface, vertex shader output variables and fragment shader input variables of the same name must match in type and qualification (other than precision and out matching to in)." The OpenGLES GLSL 3.10 Specification says: "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743 Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7819 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-01 18:06:19 -07:00
Bas Nieuwenhuizen	1a5c8c24b5	gallium: distinguish between shader IR in get_compute_param For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:13 +02:00
Bas Nieuwenhuizen	be5899dcf9	gallium: add global buffer memory barrier bit Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:06 +02:00
Bas Nieuwenhuizen	01f993a21f	gallium: add threads per block TGSI property The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:50:59 +02:00
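The "0 means unknown" convention can be sketched as follows. The struct and helper are hypothetical stand-ins and do not reproduce the real `tgsi_scan_shader` result type; the point is only that a zero-initialized struct already encodes "property not present" without extra bookkeeping.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a scan-result struct; not the real TGSI type. */
struct scan_info {
    unsigned threads_per_block; /* 0 = property missing / unknown */
};

/* Returns non-zero when the shader declared a threads-per-block value. */
static int threads_per_block_known(const struct scan_info *info)
{
    return info->threads_per_block != 0;
}
```

A driver that clears the struct with `memset` before scanning gets the "unknown" answer for free whenever the property is absent.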
Bas Nieuwenhuizen	ea8f4a6b13	gallium: add compute shader IR type Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:49:57 +02:00
Timothy Arceri	5ea825f556	glsl: remove tabs and fix some other style issues in glcpp-parse.y Note there are still tabs left in the parser rules. Acked-by: Dave Airlie <airlied@redhat.com>	2016-04-02 10:32:01 +11:00
Jason Ekstrand	cc1320220f	nir/gather_info: Add an assert for supported stages	2016-04-01 15:44:43 -07:00
Jason Ekstrand	ebb0bcc11d	nir: Move variable_get_io_mask back into gather_info It used to be in nir_gather_info.c until I moved it out to nir.h so it could be re-used with some linking code that never got merged. We'll move it back out if and when we have real code to share it with.	2016-04-01 15:39:48 -07:00
Jason Ekstrand	95106f6bfb	Merge remote-tracking branch 'public/master' into vulkan	2016-04-01 15:16:21 -07:00
Jason Ekstrand	14c46954c9	i965: Add an implementation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-01 13:52:56 -07:00
Jason Ekstrand	de60e250f5	nir: Add an opcode for stomping a 32-bit value to 16-bit precision This correlates directly to the SPIR-V opcode OpQuantizeToF16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-01 13:52:28 -07:00
Samuel Pitoiset	60e1c6a7fc	nvc0: enable compute shaders on GK104 and GM107+ Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevents using compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	71f327aa21	nvc0: bump the maximum number of UBOs for compute on Kepler The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	839a469166	nvc0/ir: do not lower shared+atomics on GM107+ For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	543fb95473	nvc0/ir: add atomics support on shared memory for Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	275019d7db	nvc0/ir: fix wrong pred emission for ld lock on GK104 This fixes `84b9b8f` (nvc0/ir: add missing emission of locked load predicate). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	4f58b78c30	nvc0/ir: add support for compute UBOs on Kepler Make sure to avoid out of bounds access in presence of indirect array indexing by loading the size from the driver constant buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	3b246a71d7	nvc0: add indirect compute support on Kepler The grid size is stored as three 32-bits integers in the indirect buffer but the launch descriptor uses a 32-bits integer for both griddim_y and griddim_z like this (z << 16) \| y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
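The `(z << 16) | y` packing described above can be sketched in plain C. The helper name is hypothetical, and the real driver performs this on the GPU side through its own command stream, so this only illustrates the arithmetic, assuming both dimensions fit in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack griddim_y into the low 16 bits and griddim_z
 * into the high 16 bits of a single 32-bit word. */
static uint32_t pack_griddim_yz(uint32_t y, uint32_t z)
{
    assert(y <= 0xffffu && z <= 0xffffu);
    return (z << 16) | y;
}
```

For example, `pack_griddim_yz(2, 3)` yields `0x00030002`: the high half carries z and the low half carries y.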
Samuel Pitoiset	7797d5f7d9	nvc0: reduce likelihood of collision for real buffers on Kepler Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	e2e8085fac	nvc0: store ubo info to the driver constbuf on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	12aa047c98	nvc0: bind user uniforms for compute on Kepler Uniform buffer objects will be stuck to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	1828d90a00	nvc0: bind shader buffers for compute on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	debd910512	nvc0: bind driver cb for compute on c7[] for Kepler Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Jose Fonseca	f72de6f386	gallivm: Prevent disassembly debug output from being truncated. By using os_log_message directly, as _debug_vprintf truncates messages to 4K. Also cleanup the disassemble interface. Spotted by Roland. Trivial.	2016-04-01 21:22:42 +01:00
Rob Clark	972054f5bf	compiler: random comment fixup Just noticed this in passing.. gl_shader_stage already has tess so this comment no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-01 12:34:40 -04:00
Brian Paul	58557b345c	docs: minor updates to license.html file Mesa demos are no longer part of the main Mesa tree/tarball. Add Gallium and GLX code to list of major components. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-01 09:50:08 -06:00
Mauro Rossi	e09d04cd56	radeonsi: use util_strchrnul() to fix android build error Android Bionic does not support strchrnul() string function, gallium auxiliary util/u_string.h provides util_strchrnul() This change avoids the following building error: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error: undefined reference to 'strchrnul' collect2: error: ld returned 1 exit status Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:56:57 +01:00
Rob Herring	952720ccee	egl: android: enable EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID Set EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID config attributes to true for Android. These are required in Marshmallow. The implementation of EGL_RECORDABLE_ANDROID support has 2 options in the definition of the extension. Android implements the 2nd option which is the encoder must support RGB input. The requested input format is RGB888, so setting the attribute on all the native Android visual formats should be sufficient. Similarly, setting EGL_FRAMEBUFFER_TARGET_ANDROID for all configs with a EGL_NATIVE_VISUAL_ID should be sufficient. Most likely, the HWC should support the same set of formats the underlying DRM driver supports. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:45:13 +01:00
Rob Herring	e21e81aa18	egl: Add EGL_RECORDABLE_ANDROID attribute This is used by Android to select an eglconfig compatible with screen recording. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: add the _eglIsConfigAttribValid check] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:45:08 +01:00
Rob Herring	8975527f58	egl: Add EGL_FRAMEBUFFER_TARGET_ANDROID attribute This is used by Android to select an eglconfig compatible with HWComposer. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: add the _eglIsConfigAttribValid check] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:44:25 +01:00
Rob Herring	2d9e0f24e1	Android: fix x86 gallium builds Builds with gallium enabled fail on x86 with linker error: external/mesa3d/src/mesa/vbo/vbo_exec_array.c:127: error: undefined reference to '_mesa_uint_array_min_max' The problem is sse_minmax.c is not included in the libmesa_st_mesa library. Since the SSE4.1 files are needed for both libmesa_st_mesa and libmesa_dricore, move SSE4.1 files into a separate static library that can be used by both. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:44:22 +01:00
Jose Fonseca	cdf7c6b83d	gallivm: Use vector selects on LLVM 3.3+. This is an old patch I had around. Vector selects seem to work well from LLVM 3.3. Using them should improve code quality, as it might make constant propagation pass more effective. Tested lp_test_* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-01 09:05:19 +01:00
Alejandro Piñeiro	cd7d631c71	glsl: do not raise uninitialized variable warnings on builtins/reserved GL variables Needed because not all the built-in variables are marked as system values, so they still have the mode ir_var_auto. Right now it fixes raising the warning when gl_GlobalInvocationID and gl_LocalInvocationIndex are used. v2: use is_gl_identifier instead of filtering for some names (Ilia Mirkin) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-01 09:54:09 +02:00
Ilia Mirkin	df03be196a	nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supported vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-31 21:53:11 -04:00
Ilia Mirkin	e0e1683087	mesa: add GL_OES/EXT_draw_buffers_indexed support This is the same ext as ARB_draw_buffers_blend (plus some core functionality that already exists). Add the alias entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 21:12:49 -04:00
Kenneth Graunke	a57320a9ba	i965: Use brw->urb.min_vs_urb_entries instead of 32 for BLORP. Haswell GT2 and GT3 have a minimum of 64 entries. Hardcoding 32 is not legal. v2: Delete stale comment (caught by Alejandro). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-31 16:45:07 -07:00
Kenneth Graunke	58d4751fa0	i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6. According to the Sandybridge PRM's description of the resinfo message, the .z value returned will be Depth == 0 ? 0 : Depth + 1. The earlier PRMs have the same table. This means we return 0 for array textures with a single slice, when we ought to return 1. Just override it to max(depth, 1). Fixes 10 dEQP-GLES3.functional tests on Sandybridge: shaders.texture_functions.texturesize.sampler2darray_fixed_vertex shaders.texture_functions.texturesize.sampler2darray_fixed_fragment shaders.texture_functions.texturesize.sampler2darray_float_vertex shaders.texture_functions.texturesize.sampler2darray_float_fragment shaders.texture_functions.texturesize.isampler2darray_vertex shaders.texture_functions.texturesize.isampler2darray_fragment shaders.texture_functions.texturesize.usampler2darray_vertex shaders.texture_functions.texturesize.usampler2darray_fragment shaders.texture_functions.texturesize.sampler2darrayshadow_vertex shaders.texture_functions.texturesize.sampler2darrayshadow_fragment Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-31 15:23:49 -07:00
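The textureSize() fix above can be sketched as a small model. Both function names are hypothetical. The sketch assumes that the "Depth" in the quoted table is the surface state's depth field, stored as the layer count minus one; that interpretation is not stated in the commit message itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the quoted resinfo behaviour: .z = Depth == 0 ? 0 : Depth + 1.
 * Assumption: depth_field holds (number of layers - 1). */
static uint32_t resinfo_z(uint32_t depth_field)
{
    return depth_field == 0 ? 0 : depth_field + 1;
}

/* The workaround from the commit message: clamp the result to at least 1. */
static uint32_t texture_size_depth(uint32_t num_layers)
{
    assert(num_layers >= 1);
    uint32_t z = resinfo_z(num_layers - 1);
    return z > 1 ? z : 1; /* max(depth, 1) */
}
```

Under that assumption, a single-slice array texture reports 0 from the raw message and 1 after the clamp, while multi-slice textures are unaffected.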
Ian Romanick	08ff5f4d1f	nir: Simplify a bcsel to logical-or Oddly, this did not affect the shader where I first noticed the pattern. That particular shader doesn't get its if-statement converted to a bcsel because there are two assignments in the else-statement. This led to me submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747. shader-db results: Sandy Bridge total instructions in shared programs: 8467384 -> 8467069 (-0.00%) instructions in affected programs: 36594 -> 36279 (-0.86%) helped: 46 HURT: 0 total cycles in shared programs: 117573448 -> 117568518 (-0.00%) cycles in affected programs: 339114 -> 334184 (-1.45%) helped: 46 HURT: 0 Ivy Bridge / Haswell / Broadwell / Skylake: total instructions in shared programs: 7774258 -> 7773999 (-0.00%) instructions in affected programs: 30874 -> 30615 (-0.84%) helped: 46 HURT: 0 total cycles in shared programs: 65739190 -> 65734530 (-0.01%) cycles in affected programs: 180380 -> 175720 (-2.58%) helped: 45 HURT: 1 No change on G45 or Ironlake. I also tried these expressions, but none of them affected any shaders in shader-db: (('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)), (('bcsel', a, 'b@bool', False), ('iand', a, b)), (('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)), Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
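The three extra rewrites listed in the commit message above can be checked with an exhaustive truth table. The C below is only a model of `bcsel` on booleans (a plain conditional select), not NIR code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Boolean model of bcsel: select x when cond is true, otherwise y. */
static bool bcsel(bool cond, bool x, bool y)
{
    return cond ? x : y;
}

/* Asserts that each listed rewrite holds for every boolean a, b. */
static void check_bcsel_rewrites(void)
{
    for (int i = 0; i < 4; i++) {
        bool a = (i & 1) != 0;
        bool b = (i & 2) != 0;
        assert(bcsel(a, a, b) == (a || b));     /* -> ior  */
        assert(bcsel(a, b, false) == (a && b)); /* -> iand */
        assert(bcsel(a, b, a) == (a && b));     /* -> iand */
    }
}
```

This only confirms that the rewrites are logically sound; as the message notes, none of them changed any shader in shader-db.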
Ian Romanick	cdea12bf03	ptn: Fix all users of ptn_swizzle None of the callers actually wanted what it did. In ptn_xpd, you only ever want a vec3 swizzle. In ptn_tex, you want a swizzle that matches the number of required texture coordinates. shader-db results: G45: total instructions in shared programs: 4011240 -> 4010911 (-0.01%) instructions in affected programs: 59232 -> 58903 (-0.56%) helped: 114 HURT: 0 total cycles in shared programs: 84314194 -> 84313220 (-0.00%) cycles in affected programs: 779150 -> 778176 (-0.13%) helped: 110 HURT: 13 Ironlake: total instructions in shared programs: 6397262 -> 6396605 (-0.01%) instructions in affected programs: 117402 -> 116745 (-0.56%) helped: 227 HURT: 0 total cycles in shared programs: 128889798 -> 128888524 (-0.00%) cycles in affected programs: 1214644 -> 1213370 (-0.10%) helped: 179 HURT: 44 Sandy Bridge: total instructions in shared programs: 8467391 -> 8467384 (-0.00%) instructions in affected programs: 3107 -> 3100 (-0.23%) helped: 10 HURT: 6 total cycles in shared programs: 117580120 -> 117573448 (-0.01%) cycles in affected programs: 103158 -> 96486 (-6.47%) helped: 84 HURT: 11 Ivy Bridge: total instructions in shared programs: 7774255 -> 7774258 (0.00%) instructions in affected programs: 1677 -> 1680 (0.18%) helped: 8 HURT: 6 total cycles in shared programs: 65743828 -> 65739190 (-0.01%) cycles in affected programs: 89312 -> 84674 (-5.19%) helped: 78 HURT: 23 Haswell: total instructions in shared programs: 7107172 -> 7107150 (-0.00%) instructions in affected programs: 2048 -> 2026 (-1.07%) helped: 16 HURT: 0 total cycles in shared programs: 64653636 -> 64647486 (-0.01%) cycles in affected programs: 86836 -> 80686 (-7.08%) helped: 85 HURT: 17 Broadwell and Skylake: total instructions in shared programs: 8447529 -> 8447507 (-0.00%) instructions in affected programs: 2038 -> 2016 (-1.08%) helped: 16 HURT: 0 total cycles in shared programs: 66418670 -> 66413416 (-0.01%) cycles in affected programs: 90110 -> 84856 (-5.83%) helped: 83 HURT: 20 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
Ian Romanick	8bb9c6ff7f	ptn: Silence unused parameter warning The KIL instruction doesn't have a destination, so ptn_kil never uses dest. program/prog_to_nir.c: In function ‘ptn_kil’: program/prog_to_nir.c:547:38: warning: unused parameter ‘dest’ [-Wunused-parameter] ptn_kil(nir_builder b, nir_alu_dest dest, nir_ssa_def *src) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
Samuel Pitoiset	d22eca5f90	tgsi: silence compiler warning in fetch_sampler_unit() The unit variable can be used uninitialized. Fixes: `24e77cb09` ("tgsi: handle indirect sampler arrays. (v2)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:16:24 +10:00
Samuel Pitoiset	05902a6686	tgsi: fix out of bounds access in exec_atomop() The number of channels must be 4 for all RGBA components. Fixes: `22d129601` ("tgsi: add support for image operations to tgsi_exec. (v2.1)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:15:16 +10:00
Brian Paul	9076e04934	tgsi: split tgsi_util_get_texture_coord_dim() function into two It was kind of overloaded, returning two different things. Now get the index of the shadow reference src register with a new tgsi_util_get_shadow_ref_src_index() function. To verify the new code, I added some temp/debug code which looped over all TGSI_TEXTURE_x values, calling the old function and new and checking that the returned indexes matched. Also tested piglit "shadow" tests with softpipe/llvmpipe. No testing of ilo and radeonsi changes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:48:00 -06:00
Brian Paul	9d7cd43988	tgsi: skip texture query opcodes when examining texture targets Should fix the assertion in piglit spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the TXQ instruction specifies a 2D target but the sampler view was declared as SHADOW2D. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-31 09:47:40 -06:00
Pierre Moreau	f96a403bc3	nv50/ir: Check for valid insn instead of def size This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Function arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-31 10:30:29 -04:00
Ilia Mirkin	a94d8d51d7	mesa: add GL_EXT_copy_image support The extension is identical to GL_OES_copy_image. But dEQP has tests that want the EXT variant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	ebdb534548	mesa: add GL_OES_copy_image support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	571f538a62	mesa: remove duplicate MAX_GEOMETRY_SHADER_INVOCATIONS entry Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	2c7f5fe296	st/mesa: add ES sample-shading support We require the full ARB_gpu_shader5 for now, but in the future some other CAP could get exposed to indicate that only the multisample-related behavior of ARB_gpu_shader5 is available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	3002296cb6	mesa: add GL_OES_shader_multisample_interpolation support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	411a88accc	mesa: add GL_OES_sample_shading support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	5283e81015	glsl: add GL_OES_sample_variables support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	6a8ca859f9	mesa: add OES_sample_variables to extension table, add enable bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	903640c2ac	glsl: add gl_MaxSamples, new in GL 4.5 / GL ES 3.2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Matt Turner	4fea98991c	i965: Don't add barrier deps for FB write messages. Ken did this earlier, and this is just me reimplementing his patch a little differently. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	3495265158	i965: Add and use is_scheduling_barrier() function.	2016-03-30 19:54:30 -07:00
Matt Turner	b4e223cfbf	i965: Remove NOP insertion kludge in scheduler. Instead of removing every instruction in add_insts_from_block(), just move the instruction to its scheduled location. This is a step towards doing both bottom-up and top-down scheduling without conflicts. Note that this patch changes cycle counts for programs because it begins including control flow instructions in the estimates. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	a607f4aa57	i965: Assert that an instruction is not inserted around itself. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	7b208a7312	i965: Relax restriction on scheduling last instruction. I think when this code was written, basic blocks were always ended by a control flow instruction or an end-of-thread message. That's no longer the case, and removing this restriction actually helps things: instructions in affected programs: 7267 -> 7244 (-0.32%) helped: 4 total cycles in shared programs: 66559580 -> 66431900 (-0.19%) cycles in affected programs: 28310152 -> 28182472 (-0.45%) helped: 9577 HURT: 879 GAINED: 2 The addition of the is_control_flow() checks is not a functional change, since the add_insts_from_block() does not put them in the list of instructions to schedule. I plan to change this in a later patch. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	f60750968c	i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO. Missing this causes an assertion failure in the scheduler with the next patch. Additionally, this gives cmod propagation enough information to optimize code better. total instructions in shared programs: 7112991 -> 7112852 (-0.00%) instructions in affected programs: 25704 -> 25565 (-0.54%) helped: 139 total cycles in shared programs: 64812898 -> 64810674 (-0.00%) cycles in affected programs: 127224 -> 125000 (-1.75%) helped: 139 Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	436bdd7403	Revert "i965: Don't add barrier deps for FB write messages." This reverts commit `d0e1d6b7e2`. The change in the vec4 code is a mistake -- there's never an FS_OPCODE_FB_WRITE in vec4 code. The change in the fs code had the (harmless) effect of not recognizing an FB_WRITE as a scheduling barrier even if it was marked EOT -- harmless because the scheduler marked the last instruction of a block as a barrier, something I'm changing in the following patches. This will be reimplemented later in the series.	2016-03-30 19:54:30 -07:00
Matt Turner	0d253ce34a	i965: Simplify full scheduling-barrier conditions. All of these were simply code for "architecture register file" (and in the case of destinations, "not the null register"). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	65bc94022b	i965: Remove incorrect cycle estimates. These printed the cycle count of the last basic block (sched.time is set per basic block!). We have accurate, full program, data printed elsewhere. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:29 -07:00
Dave Airlie	10b189f985	st/mesa: fix fallout from xfb changes. Failed to update state tracker with new buffer interface. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:36:55 +10:00
Matt Turner	05ee6627d6	nir: Fix typo from commit `6702f1acde`.	2016-03-30 19:18:35 -07:00
Timothy Arceri	b273958c74	docs: mark xfb_* qualifiers as DONE Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:53:08 +11:00
Timothy Arceri	c5704bb350	mesa: add query support for GL_TRANSFORM_FEEDBACK_BUFFER interface Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:53:02 +11:00
Timothy Arceri	7234be0338	glsl: add transform feedback buffers to resource list Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:57 +11:00
Timothy Arceri	9e317271d7	mesa: add support to query GL_TRANSFORM_FEEDBACK_BUFFER_INDEX Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:47 +11:00
Timothy Arceri	51142e7705	mesa: add support to query GL_OFFSET for GL_TRANSFORM_FEEDBACK_VARYING Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:43 +11:00
Timothy Arceri	047139e8a0	mesa: rename transform feedback varying macro XFB to XFV A later patch will use XFB for buffers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:39 +11:00
Timothy Arceri	b77c909878	glsl: always enable transform feedback mode when xfb_stride defined This enables in shader defined transform feedback mode even if the only place xfb_stride is defined is on the global out. We don't worry about xfb_buffer since Issue 22 c) in the spec says: "If the shader has an "xfb_buffer" qualifier identifying a buffer, but doesn't declare "xfb_offset" on anything associated with it, what happens? ... variables not qualified with "xfb_offset" are not captured, which makes the associated "xfb_buffer" qualifier irrelevant." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:34 +11:00
Timothy Arceri	c95e92b14d	glsl: handle varyings that are not written to but have an xfb_offset Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:29 +11:00
Timothy Arceri	d5c09d40b9	glsl: when lowering named interface set assigned flag This will be used when checking if xfb should attempt to capture a varying. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:22 +11:00
Timothy Arceri	a2fbc5ed44	glsl: reset current stream tracker When we move to the next buffer we need to reset the stream so that we don't generate an error message about streams not matching. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:17 +11:00
Timothy Arceri	f2a3c87a00	glsl: generate link error when implicit stride is too large This moves the check until after we have done the stride calculation and applies it to the xfb_* qualifiers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:11 +11:00
Timothy Arceri	2fab85aaea	glsl: add xfb_stride link time validation From the ARB_enhanced_layouts spec: "It is a compile-time or link-time error to have any xfb_offset that overflows xfb_stride, whether stated on declarations before or after the xfb_stride, or in different compilation units. ... When no xfb_stride is specified for a buffer, the stride of a buffer will be the smallest needed to hold the variable placed at the highest offset, including any required padding." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:05 +11:00
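The two rules quoted from ARB_enhanced_layouts above (an xfb_offset must not overflow xfb_stride, and the implicit stride is the smallest that holds the variable at the highest offset) can be sketched as follows. Both helpers are hypothetical and work on plain byte offsets and sizes; the `align` parameter standing in for "any required padding" is an assumption of this sketch, not the linker's actual logic.

```c
#include <assert.h>
#include <stdbool.h>

/* True when a captured variable at [offset, offset + size) fits in stride. */
static bool xfb_offset_fits_stride(unsigned offset, unsigned size,
                                   unsigned stride)
{
    return offset + size <= stride;
}

/* Smallest stride holding every captured variable, rounded up to `align`
 * bytes (assumed padding rule; align must be non-zero). */
static unsigned implicit_xfb_stride(const unsigned *offsets,
                                    const unsigned *sizes,
                                    unsigned count, unsigned align)
{
    assert(align != 0);
    unsigned end = 0;
    for (unsigned i = 0; i < count; i++)
        if (offsets[i] + sizes[i] > end)
            end = offsets[i] + sizes[i];
    return (end + align - 1) / align * align;
}
```

With a vec4 at offset 0 and a vec3 at offset 16, the highest end is byte 28, so the implicit stride is 28 with 4-byte padding and 32 with 8-byte padding.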
Timothy Arceri	8120e869b1	glsl: validate global out xfb_stride qualifiers and set stride on empty buffers Here we use the built-in validation in ast_layout_expression::process_qualifier_constant() to check for mismatching global out strides on buffers in a single shader. From the ARB_enhanced_layouts spec: "While xfb_stride can be declared multiple times for the same buffer, it is a compile-time or link-time error to have different values specified for the stride for the same buffer." For intrastage validation a new helper link_xfb_stride_layout_qualifiers() is created. We also take this opportunity to make sure stride is at least a multiple of 4, we will validate doubles at a later stage. From the ARB_enhanced_layouts spec: "If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results." Finally we update store_tfeedback_info() to apply the strides to LinkedTransformFeedback and update the buffers bitmask to mark any global buffers with a stride as active. For example a shader with: layout (xfb_buffer = 0, xfb_offset = 0) out vec4 gs_fs; layout (xfb_buffer = 1, xfb_stride = 64) out; Is expected to have a buffer bound to both 0 and 1. From the ARB_enhanced_layouts spec: "A binding point requires a bound buffer object if and only if its associated stride in the program object used for transform feedback primitive capture is non-zero." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:00 +11:00
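The stride rules cited in the message above can be sketched as two small checks. Both helper names are hypothetical, and using 0 to mean "no stride declared yet" is an assumption of the sketch rather than something taken from the linker source.

```c
#include <assert.h>
#include <stdbool.h>

/* Stride must be a multiple of 8 when doubles are captured, else of 4. */
static bool xfb_stride_is_aligned(unsigned stride, bool captures_doubles)
{
    unsigned required = captures_doubles ? 8 : 4;
    return stride % required == 0;
}

/* Record a declared stride for one buffer. Returns false when it conflicts
 * with an earlier declaration; *current == 0 means "none declared yet". */
static bool merge_xfb_stride(unsigned *current, unsigned declared)
{
    assert(current != 0);
    if (*current != 0 && *current != declared)
        return false;
    *current = declared;
    return true;
}
```

Repeating `xfb_stride = 64` for the same buffer is accepted, while following it with `xfb_stride = 32` would be reported as a mismatch.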
Timothy Arceri	cf039a309a	mesa: split transform feedback buffer into its own struct This will be used in a following patch to implement interface query support for TRANSFORM_FEEDBACK_BUFFER. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:52 +11:00
Timothy Arceri	258299d87a	glsl: use bitmask of active xfb buffer indices This allows us to print the correct binding point when not all buffers declared in the shader are bound. For example if we use a single buffer: layout(xfb_buffer=2, offset=0) out vec4 v; We now print '2' when the buffer is not bound rather than '0'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:47 +11:00
Timothy Arceri	99cb5151ed	glsl: sort xfb varyings in offset/buffer order The existing transform feedback code expects to receive the list of varyings in increasing buffer order. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:38 +11:00
Timothy Arceri	0c66460fc6	glsl: basic linking support for xfb qualifiers This adds the initial infrastructure for enabling transform feedback mode via in shader qualifiers and adds initial buffer support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:33 +11:00
Timothy Arceri	4305a60173	glsl: add xfb helpers and fields to the tfeedback_decl class We also apply any array/struct offsets. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:27 +11:00
Timothy Arceri	0822517936	glsl: add helper to process xfb qualifiers during linking This function checks for any xfb_* qualifiers which will enable transform feedback mode and cause any API defined xfb varyings to be ignored. It also counts the number of varyings that have a xfb_offset qualifier and finally it calls the create_xfb_varying_names() helper to generate the names of varyings to be captured. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:21 +11:00
Timothy Arceri	707fd3972f	glsl: add helper to generate xfb varying names Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:17 +11:00
Timothy Arceri	8b6f8fe503	glsl: add helper for counting varyings This will be used to get a count of the number of varying name strings we are required to generate for use with the query api. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:06 +11:00
Timothy Arceri	ba7a7d4c39	glsl: add xfb qualifier lowering support for named blocks Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:01 +11:00
Timothy Arceri	4a873ef049	glsl: add xfb qualifiers to has_layout helper Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:54 +11:00
Timothy Arceri	598790e856	glsl: apply xfb_stride to implicit offsets for ifc block members When we have an interface block like: layout (xfb_buffer = 0, xfb_offset = 0) out Block { vec4 var1; layout (xfb_stride = 32) vec4 var2; vec4 var3; }; We take into account the stride of var2 when calculating the offset for var3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:49 +11:00
Timothy Arceri	04a72e6e57	glsl: add xfb_stride compile time rules From the ARB_enhanced_layouts spec: "The xfb_stride qualifier specifies how many bytes are consumed by each captured vertex. It applies to the transform feedback buffer for that declaration, whether it is inherited or explicitly declared. It can be applied to variables, blocks, block members, or just the qualifier out. If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results. ... The resulting stride (implicit or explicit) must be less than or equal to the implementation-dependent constant gl_MaxTransformFeedbackInterleavedComponents." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:44 +11:00
Timothy Arceri	edddad0eee	glsl: add xfb_offset compile time rules We also copy the qualifier values to the IR in this step. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:39 +11:00
Timothy Arceri	f6a8c7ef21	glsl: add xfb_buffer compile time rules Also copies the qualifier values to GLSL IR. From the ARB_enhanced_layouts spec: "The xfb_buffer qualifier can be applied to the qualifier out, to output variables, to output blocks, and to output block members. Shaders in the transform feedback capturing mode have an initial global default of layout(xfb_buffer = 0) out; This default can be changed by declaring a different buffer with xfb_buffer on the interface qualifier out. This is the only way the global default can be changed. When a variable or output block is declared without an xfb_buffer qualifier, it inherits the global default buffer. When a variable or output block is declared with an xfb_buffer qualifier, it has that declared buffer. All members of a block inherit the block's buffer. A member is allowed to declare an xfb_buffer, but it must match the buffer inherited from its block, or a compile-time error results. The xfb_buffer qualifier follows the same conventions, behavior, defaults, and inheritance rules as the qualifier stream, and the examples for stream apply here as well. This includes a block's inheritance of the current global default buffer, a block member's inheritance of the block's buffer, and the requirement that any xfb_buffer declared on a block member must match the buffer inherited from the block. ... It is a compile-time error to specify an xfb_buffer that is greater than the implementation-dependent constant gl_MaxTransformFeedbackBuffers." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:34 +11:00
Timothy Arceri	04d2f770c8	glsl: add field to track if xfb_buffer is an explicit or implicit value Since any of the xfb_* qualifiers trigger the shader to be in transform feedback mode we need an extra field to track if the xfb_buffer on interface members was set explicitly since xfb_buffer will always have a default value. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:29 +11:00
Timothy Arceri	733f1b2a55	glsl: add xfb_* qualifiers to glsl_struct_field These will be used to hold qualifier values for interface and struct members. Support is added to the struct/interface constructors to copy these fields upon creation. We also update record_compare() to ensure we don't reuse a glsl_type with the wrong xfb_* qualifier values. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:19 +11:00
Timothy Arceri	2dbcecb7a9	glsl: add IR fields for transform feedback layout qualifiers Adds xfb_buffer/stride fields and adds comment to offset field which is reused for xfb_offset. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:13 +11:00
Timothy Arceri	5c2516fc33	glsl: add validation for out layout qualifiers This adds validation for all qualifiers as allowed by the table in Section 4.4 (Layout Qualifiers) of the GLSL 4.5 spec. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:08 +11:00
Timothy Arceri	7b407fecec	glsl: relax stage restrictions on layout defaults for outputs The new xfb_buffer and xfb_stride global qualifiers are allowed in geom, tess and vertex stages. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:04 +11:00
Timothy Arceri	c9afd94af6	glsl: parse new transform feedback layout qualifiers We reuse the existing offset field for holding the xfb_offset expression but create a new flag so as to avoid hitting the rules for the offset qualifier for UBOs. xfb_buffer qualifiers require extra processing when merging as they can be applied to global out defaults. We just apply the same rules as we do for the stream qualifier as the spec says: "The xfb_buffer qualifier follows the same conventions, behavior, defaults, and inheritance rules as the qualifier stream, and the examples for stream apply here as well." For xfb_stride we push everything into a global out field for later processing as xfb_stride applies to the entire buffer. We still need to have a separate field to store per variable strides because they can still affect implicit offsets e.g. when applied to block members with implicit offsets. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:00 +11:00
Timothy Arceri	13f6c788eb	glsl: move process_qualifier_constant() to ast_type.cpp We will make use of this function being here in the following patch. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:55 +11:00
Timothy Arceri	52caeee7e7	glsl: add transform feedback built-in constants These are new built-ins added by ARB_enhanced_layouts. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:51 +11:00
Timothy Arceri	8765a9e0fe	glsl: generate named interface block names correctly Firstly this updates the named interface lowering pass to store the interface without the arrays removed. Note we need to remove the arrays in the interface/varying matching code to not regress things but in future this should be fixed further as it would seem we currently successfully match interface blocks with different array sizes. Since we now know if the interface was an array we can reduce the IR flags from_named_ifc_block_array and from_named_ifc_block_nonarray to just from_named_ifc_block. Next rather than having a different code path for named interface blocks in program_resource_visitor we just make use of the one used by UBOs this allows us to now handle arrays of arrays correctly. Finally we add a new param to the recursion function named_ifc_member this is because we only want to process a single member at a time. Note that this is also the glsl_struct_field from the original ifc type before lowering rather than the type from the lowered variable. This fixes a bug in Mesa where we would generate the names like WithInstArray[0].g[0][0] when it should be WithInstArray[0].g[0] for the following interface. out WithInstArray { float g[3]; } instArray[2]; Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:47 +11:00
Timothy Arceri	7ebc3deaad	glsl: Fix segfault when lhs is error_type in TCS It seems expected that both lhs and rhs could be of type error_type in this code however the TCS case wasn't expecting it. Fixes segfault in an enhanced layouts GL CTS test. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:42 +11:00
Dave Airlie	c9367c13ca	docs: update softpipe status for shader_image_load_store. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:30 +10:00
Dave Airlie	eb9ad9faa3	softpipe: add image support to softpipe (v3) This adds support for ARB_shader_image_load_store to softpipe. v2: add RESQ support (Ilia) v3: constify, cleanup internals, add some comments (Brian). Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:16 +10:00
Dave Airlie	0d1f679ded	draw: add support for passing images to vs/gs shaders. This just adds support for passing through images to the tgsi execution stage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:11 +10:00
Dave Airlie	22d1296013	tgsi: add support for image operations to tgsi_exec. (v2.1) This adds support for load/store/atomic operations on images along with image tracking support. v2: add RESQ support. (Ilia) v2.1: constify interface (Brian) split get_image_coord_dim (Brian) Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:05 +10:00
Dave Airlie	493eab7679	softpipe: add support for explicit early depth testing ARB_shader_image_load_store adds support for explicit early depth testing. However we need to make sure we don't overwrite values using the shader written values in this case. This fixes early depth testing in softpipe to conform with those requirements. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:54 +10:00
Dave Airlie	827393b76f	tgsi: introduce NonHelperMask This is a mask of which of the current 2x2 grid are non-helper invocations. This allows us to mask off the helper invocations later for the image operations. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:50 +10:00
Dave Airlie	ca180c09bb	tgsi_exec: handle execmask when doing indirect lookups Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:46 +10:00
Dave Airlie	1ff4cc0535	tgsi_exec: add support for up to 3 address registers (v2) v2: be consistent with other definitions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:08 +10:00
Matt Turner	6702f1acde	nir: Propagate negates up multiplication chains. total instructions in shared programs: 7112159 -> 7088092 (-0.34%) instructions in affected programs: 1374915 -> 1350848 (-1.75%) helped: 7392 HURT: 621 GAINED: 2 LOST: 2	2016-03-30 13:12:34 -07:00
Matt Turner	a74fc3fe8a	i965: Don't inline intel_batchbuffer_require_space(). It's called by the inline intel_batchbuffer_begin() function which itself is used in BEGIN_BATCH. So in sequence of code emitting multiple packets, we have inlined this ~200 byte function multiple times. Making it an out-of-line function presumably improved icache usage. Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on Ivybridge. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2016-03-30 13:12:34 -07:00
Christian König	1faca438bd	r600: ignore PIPE_BIND_LINEAR in *_is_format_supported Similar to radeonsi linear layout should work for all not compressed or depth/stencil formats. Fixes issues with VDPAU on r600. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-03-30 20:00:27 +02:00
Thomas Hindoe Paaboel Andersen	9a73f5728e	st/vdpau: correct null check The null check of result was the wrong way around. Also, move memset and dereference of result after the null check. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-30 20:00:27 +02:00
Brian Paul	4541a78502	docs: remove docs/COPYING which contains GPL license There hasn't been GPL code in Mesa for a long time now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-30 11:38:51 -06:00
Samuel Pitoiset	bb37886f75	glsl: add missing types for buffer images Type of GLSL_SAMPLER_DIM_BUF can be sampler or image. Spotted while trying to run dEQP tests related to ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-30 19:01:33 +02:00
Lars Hamre	6773128bbf	glsl: invalidate float suffixes for GLSL 1.10 and GLSL ES 1.00 Float suffixes are not allowed in GLSL 1.10 nor GLSL ES 1.00. Fixes the following piglit tests: tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-capital-f.vert tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-f.vert v2: modify error message v3: parse the float instead of returning an ERROR_TOK v4: (by Ken) Change to is_version(120, 300) to avoid breaking ES3 shaders; update commit message accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81585 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-29 21:26:34 -07:00
Jason Ekstrand	cf2257069c	nir/spirv: Set a default number of invocations for geometry shaders The SPIR-V spec says geometry shaders are supposed to have one invocation by default. The execution mode is only required if there are multiple invocations.	2016-03-29 20:30:27 -07:00
Roland Scheidegger	2d3b8aefda	tgsi: (trivial) only verify target for is_tex instructions d3d10 state tracker does not encode (valid) target (only offsets are really used from the texture bits), since that information always comes from the sview dcl, and not the instruction (note the meaning of target is actually slightly different between gl and d3d10 in any case, because d3d10 target does never include shadow bit). Also move the msaa sampler identification as well - would need to set that on the sview not sampler, so while this does not fix it make it at least obvious it won't work with sample instructions.	2016-03-30 04:26:54 +02:00
Ilia Mirkin	553e37aa33	mesa: allow mutable buffer textures to back GL ES images Since there is no way to create immutable texture buffers in GL ES, mutable buffer textures are allowed to back images. See issue 7 of the GL_OES_texture_buffer specification. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-29 21:41:03 -04:00
Brian Paul	513384d7e8	mesa: make _mesa_prepare_mipmap_level() static No longer called from any other file. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-29 18:13:46 -06:00
Brian Paul	ed39de90f1	meta: use _mesa_prepare_mipmap_levels() The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is not needed. It only served to undo the GL_TEXTURE_1D_ARRAY height/depth change that was made before the call to prepare_mipmap_level(). Said another way, regardless of how the meta code manipulates the height/depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are correctly set up by _mesa_prepare_mipmap_levels(). Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver and testing with piglit. v2 (idr): Early out of the mipmap generation loop when dstImage is NULL. This can occur for immutable textures that have a limited range of levels or in the presence of memory allocation failures. Fixes arb_texture_view-mipgen on Intel platforms. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	bab0752a80	docs: add HTTP link for Mesa downloads Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92628 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-29 18:13:46 -06:00
Brian Paul	5c85c3be26	tgsi: simplify tgsi_shader_info::is_msaa_sampler checking We assert that fullinst->Instruction.Texture != 0 above so no need to check it in the conditional. We also have the fullinst->Texture.Texture value in a local variable, so use it. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	86e1768c13	tgsi: collect texture sampler target info in tgsi_scan_shader() Texture sample instructions specify a sampler unit and texture target such as "1D", "2D", "CUBE", etc. Sampler view declarations also specify the sampler unit and texture target. This patch checks that the texture instructions agree with the declarations and collects the texture target type for each sampler unit. v2: only compare instruction's texture target to the sampler view declaration target if the instruction is a TEX instruction, not a SAMPLE instruction. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	6775268b61	gallium/docs: s/gven/given/	2016-03-29 18:13:46 -06:00
Brian Paul	75b713455c	xlib: add support for GLX_ARB_create_context This adds the glXCreateContextAttribsARB() function for the xlib/swrast driver. This allows more piglit tests to run with this driver. For example, without this patch we get: $ bin/fbo-generatemipmap-1d -auto piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL version not equal to the default value 1.0 piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context piglit: info: Failed to create any GL context PIGLIT: {"result": "skip" } Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:45 -06:00
Brian Paul	d8d029f22b	st/mesa: simplify st_generate_mipmap() The whole st_generate_mipmap() function was overly complicated. Now we just call the new _mesa_prepare_mipmap_levels() function to prepare the texture mipmap memory, then call the generate function which fills in the texture images. This fixes a failed assertion in llvmpipe/softpipe which is hit with the new piglit generatemipmap-base-change test. Also fixes some device errors (format mismatches) with the VMware svga driver. v2: fix a comment typo, per Sinclair Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:45 -06:00
Brian Paul	105fe52784	mesa: new _mesa_prepare_mipmap_levels() function for mipmap generation Simplifies the loops in generate_mipmap_uncompressed() and generate_mipmap_compressed(). Will be used in the state tracker too. Could probably be used in the meta code. If so, some additional clean-ups can be done after that. v2: use unsigned types instead of GLuint, per Ian Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-29 18:13:45 -06:00
Kenneth Graunke	d4a5a61d44	i965: Don't use CUBE wrap modes for integer formats on IVB/BYT. There is no linear filtering for integer formats, so we should always be using CLAMP_TO_EDGE mode. Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit `0faf26e6a0`). This workaround doesn't appear to be necessary on any other hardware; I haven't found any documentation mentioning errata in this area. v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]	2016-03-29 15:43:18 -07:00
Kenneth Graunke	f8c69fbb54	Revert "i965: Set address rounding bits for GL_NEAREST filtering as well." This reverts commit `60d6a8989a`. It's pretty sketchy, and apparently regressed a bunch of dEQP tests on Sandybridge.	2016-03-29 15:35:07 -07:00
Rovanion Luckey	7087e0ab27	gallium: Format code in pb_buffer_fenced.c according to style guide. This is a tiny housekeeping patch which does the following: * Replaced tabs with three spaces. * Formatted oneline and multiline code comments. Some doxygen comments weren't marked as such and some code comments were marked as doxygen comments. * Spaces between if- and while-statements and their parenthesis. According to the mesa coding style guidelines. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:44:11 -06:00
Charmaine Lee	2d8df0306b	svga: emit sampler declarations in the helper function for non vgpu10 With commit `dc9ecf58c0`, we are now getting the sampler target from the sampler view declaration. But since a sampler view declaration can be defined after a sampler declaration, we need to emit the sampler declarations in the pre-helpers function, otherwise, the sampler target might not have been defined yet for the sampler declaration. Fixes viewperf maya-03 and various gl trace regressions in hwv11. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:35:09 -06:00
Brian Paul	96e0894106	svga: avoid freeing non-malloced memory svga_shader_expand() will fall back to using non-malloced memory for emit.buf if malloc fails. We should check if the memory is malloced before freeing it in the error path of svga_tgsi_vgpu9_translate. Original patch by Thomas Hindoe Paaboel Andersen <phomes@gmail.com>. Remove trivial svga_destroy_shader_emitter() function, by BrianP. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:35:08 -06:00
Samuel Pitoiset	9d57c84994	nvc0/ir: move load/store lowering pass to handleLDST() Having all this code in a big switch is not really a good practice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-29 19:55:51 +02:00
Christian König	cc68dc2b5e	st/mesa: implement new DMA-buf based VDPAU interop v2 Avoid using internal structures from another API. v2: rebase and moved includes so they don't cause problem when VDPAU isn't installed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:22 +02:00
Christian König	bdeb22b7b6	st/vdpau: implement the new DMA-buf based interop v2 That should allow us to get away from passing internal structures around. v2: rebased Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:18 +02:00
Christian König	0042aa508e	st/vdpau: move FormatRGBAToPipe into the interop We are going to need that in the Mesa state tracker as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:14 +02:00
Christian König	faba96bc60	st/vdpau: add new interop interface Use DMA-buf for the VDPAU interop interface instead of using internal structures. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:10 +02:00
Christian König	d180de3532	st/vdpau: use linear layout for output surfaces Works around a bug in radeonsi and tiling is actually not very beneficial in this use case. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:28:43 +02:00
Christian König	7eb5e5b8b4	radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2 Linear layout should work for all not compressed or depth/stencil formats. v2: restrict it a bit more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-29 17:28:35 +02:00
Ilia Mirkin	9286cbdd1e	st/mesa: enable OES_texture_buffer when all components available OES_texture_buffer combines bits from a number of desktop extensions. When they're all available, turn it on. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-29 10:15:21 -04:00
Adam Jackson	5e1aec6db0	glapi/glx: Mark the indirect swapped dispatch functions _X_COLD A modest size savings: text data bss dec hex filename 264143 15608 232 279983 445af libglx.so.before 254303 15608 232 270143 41f3f libglx.so.after Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-29 10:10:57 -04:00
Adam Jackson	ea0f62e45e	glapi/glx: Sync some additional error checking from xserver Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-29 10:10:57 -04:00
Jordan Justen	f56f538ce4	anv/gen7: Fix command parser version test with indirect dispatch Caught-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 22:30:33 -07:00
Alejandro Piñeiro	dcd41ca87a	glsl: raise warning when using uninitialized variables v2: * Take into account out varyings too (Timothy Arceri) * Fix style (Timothy Arceri) * Use a new ast_expression variable, instead of an ast_expression::hir new parameter (Timothy Arceri) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-29 07:28:57 +02:00
Alejandro Piñeiro	8568d02498	glsl: add is_lhs bool on ast_expression Useful to know if an expression is the recipient of an assignment or not, that would be used to (for example) raise warnings of "use of uninitialized variable" without getting a false positive when assigning first a variable. By default the value is false, and it is assigned to true on the following cases: * The lhs assignments subexpression * At ast_array_index, on the array itself. * While handling the method on an array, to avoid the warning calling array.length * When computed the cached test expression at test_to_hir, to avoid a duplicate warning on the test expression of a switch. set_is_lhs setter is added, because in some cases (like ast_field_selection) the value needs to be propagated on the expression tree. To avoid doing the propagation if not needed, it skips if no primary_expression.identifier is available. v2: use a new bool on ast_expression, instead of a new parameter on ast_expression::hir (Timothy Arceri) v3: fix style and some typos on comments, initialize is_lhs default value on constructor, to avoid a c++11 feature (Ian Romanick) v4: some tweaks on comments (Timothy Arceri) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-29 07:28:57 +02:00
Jason Ekstrand	35e2e96b30	nir: Add a helper for getting the current block from a cursor Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	be98c47528	nir/lower_out_to_temp: Add an "entrypoint" parameter Previously, the pass assumed that the entrypoint would be whatever function happened to have the name "main". We really shouldn't trust in the function names. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	31a5bec93f	nir/lower_out_to_temp: Steal the output's constant initializer Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	38de85f9a5	nir: Add a helper for getting the unique function in a shader Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	49be812be6	nir/sweep: Sweep function parameters They are no longer in the list of local variables so we need to explicitly sweep them. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	1be4c61c95	nir/builder: Add a helper for creating undefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	6a2479d618	nir/builder: Add a helper for storing to variable derefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	77e2ac1da7	nir/builder: Add a helper for building fdot instructions Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	da422663a6	nir: Add a variable_foreach_safe helper Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	731870fbe3	nir/Makefile: Fix alphabetization Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Ilia Mirkin	b4c0c514b1	mesa: add OES_texture_buffer and EXT_texture_buffer support Allow ES 3.1 contexts to access the texture buffer functionality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:29:29 -04:00
Ilia Mirkin	720670a615	glsl: add OES_texture_buffer and EXT_texture_buffer support Expose the samplerBuffer/imageBuffer types, and allow the various functions to operate on them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:20:49 -04:00
Ilia Mirkin	74b76c08a3	mesa: add OES_texture_buffer and EXT_texture_buffer extension to table We need to add a new bit since the GL ES exts require functionality from a combination of texture buffer extensions as well as images (for imageBuffer) support. Additionally, not all GPUs support all the texture buffer functionality (e.g. rgb32 isn't supported by nv50). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:19:14 -04:00
Ilia Mirkin	659beca666	mesa: properly return GetTexLevelParameter queries for buffer textures This fixes all failures with dEQP tests in this area. While ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co should not be supported, GL 3.1 reverses this decision and allows all of these queries there. Conversely, there is no text that forbids the buffer-specific queries from being used with non-buffer images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-28 20:18:46 -04:00
Kenneth Graunke	4ed4a2af86	glsl: Delete initialized field from uniform storage test. Timothy deleted this field. Fixes "make check". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-28 17:02:00 -07:00
Jordan Justen	8dbfa265a4	anv/gen7: DispatchIndirect requires cmd parser 5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	1a3adae84a	anv/gen7: Save kernel command parser version Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	f60683b32a	anv: Invalidate state cache before L3 partitioning set-up. Port `10d84ba9f0` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	5879cb0251	anv: Fix cache pollution race during L3 partitioning set-up. Port `0aa4f99f56` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Timothy Arceri	86d87d1047	mesa: remove initialized field from uniform storage The only place this was used was in a gallium debug function that had to be manually enabled. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 09:59:03 +11:00
Samuel Pitoiset	b8b3af2932	nvc0: use a different offset for buffers and surfaces To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-29 00:47:28 +02:00
Kenneth Graunke	60d6a8989a	i965: Set address rounding bits for GL_NEAREST filtering as well. Yuanhan Liu decided these were useful for linear filtering in commit `76669381` (circa 2011). Prior to that, we never set them; it seems he tried to preserve that behavior for nearest filtering. It turns out they're useful for nearest filtering, too: setting these fixes the following dEQP-GLES3 tests: functional.fbo.blit.rect.nearest_consistency_mag functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_min functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x 
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y Apparently, BLORP has always set these bits unconditionally. However, setting them unconditionally appears to regress tests using texture projection, 3D samplers, integer formats, and vertex shaders, all in combination, such as: functional.shaders.texture_functions.textureprojlod.isampler3d_vertex Setting them on Gen4-5 appears to regress Piglit's tests/spec/arb_sampler_objects/framebufferblit. Honestly, it looks like the real problem here is a lack of precision. I'm just hacking around problems here (as embarrassing as it is). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 15:28:58 -07:00
Kenneth Graunke	0faf26e6a0	i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering. When using seamless cube map mode and NEAREST filtering, we explicitly overrode the wrap modes to CLAMP_TO_EDGE. This was to implement the following spec text: "If NEAREST filtering is done within a miplevel, always apply wrap mode CLAMP_TO_EDGE." However, textureGather() ignores the sampler's filtering mode, and instead returns the four pixels that would be blended by LINEAR filtering. This implies that we should do proper seamless filtering, and include pixels from adjacent cube faces. It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE overrides. Normal cube map sampling works by first selecting the face, and then nearest filtering fetches the closest texel. If the nearest texel was on a different face, then that face would have been chosen. So it should always be within the face anyway, which effectively performs CLAMP_TO_EDGE. Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-28 15:25:04 -07:00
Kenneth Graunke	72473658c5	i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs. Our driver uses the brw_render_cache mechanism to track buffers we've rendered to and are about to sample from. Previously, we did a single PIPE_CONTROL with the following bits set: - Render Target Flush - Depth Cache Flush - Texture Cache Invalidate - VF Cache Invalidate - Instruction Cache Invalidate - CS Stall This combined both "top of pipe" invalidations and "bottom of pipe" flushes, which isn't how the hardware is intended to be programmed. The "top of pipe" invalidations may happen right away, without any guarantees that rendering using those caches has completed. That rendering may continue altering the caches. The "bottom of pipe" flushes do wait for the rendering to complete. The CS stall also prevents further work from happening until data is flushed out. What we wanted to do was wait for rendering complete, flush the new data out of the render and depth caches, wait, then invalidate any stale data in read-only caches. We can accomplish this by doing the "bottom of pipe" flushes with a CS stall, then the "top of pipe" flushes as a second PIPE_CONTROL. The flushes will wait until the rendering is complete, and the CS stall will prevent the second PIPE_CONTROL with the invalidations from executing until the first is done. Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo subtests on Braswell and Skylake. These tests hit the meta PBO texture upload path, which binds the PBO as a texture and samples from it, while rendering to the destination texture. The tests then sample from the texture. For now, we leave Gen4-5 alone. It probably needs work too, but apparently it hasn't even been setting the (G45+) TC invalidation bit at all... v2: Add Sandybridge post-sync non-zero workaround, for safety. 
Cc: mesa-stable@lists.freedesktop.org Suggested-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-28 15:23:56 -07:00
Kenneth Graunke	de505f7d7b	i965: Whack UAV bit when FS discards and there are no color writes. dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no framebuffer attachments, using a shader that discards based on gl_FragCoord. It uses occlusion queries to inspect whether pixels are rendered or not. Unfortunately, the hardware is not dispatching any pixel shaders, so discards never happen, and the full quad of pixels increments PS_DEPTH_COUNT, making the occlusion query results bogus. To understand why, we have to delve into the WM_INT internal signalling mechanism's formulas. The "WM_INT::Pixel Shader Kill Pixel" signal is defined as: 3DSTATE_WM::ForceKillPixel == ON || (3DSTATE_WM::ForceKillPixel != Off && !WM_INT::WM_HZ_OP && 3DSTATE_WM::EDSC_Mode != PREPS && (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) && (3DSTATE_PS_EXTRA::PixelShaderKillsPixels || 3DSTATE_PS_EXTRA::oMask Present to RenderTarget || 3DSTATE_PS_BLEND::AlphaToCoverageEnable || 3DSTATE_PS_BLEND::AlphaTestEnable || 3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable)) Because there is no depth or stencil buffer, writes to those buffers are disabled. So the (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) term is false, making the whole "Kill Pixel" condition false. 
This then feeds into the following "WM_INT::ThreadDispatchEnable" condition: 3DSTATE_WM::ForceThreadDispatch != OFF && !WM_INT::WM_HZ_OP && 3DSTATE_PS_EXTRA::PixelShaderValid && (3DSTATE_PS_EXTRA::PixelShaderHasUAV || WM_INT::Pixel Shader Kill Pixel || WM_INT::RTIndependentRasterizationEnable || (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT && 3DSTATE_PS_BLEND::HasWriteableRT) || (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF && (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) || (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) || (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable || WM_INT::Stencil Test Enable))) Given that there's no depth/stencil testing, no writeable render target, and the hardware thinks kill pixel doesn't happen, all of these conditions are false. We have to whack some bit to make PS invocations happen. There are many options. Curro suggested using the UAV bit. There's some precedent for doing that - we set it for fragment shaders that do SSBO/image/atomic writes when no color buffer writes are enabled. We can simply include discard here too. Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests. v2: Add a comment suggested and written by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-28 14:36:47 -07:00
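The two WM_INT formulas quoted in commit de505f7d7b can be evaluated by hand for the no-attachments case. The following is only a sketch of that evaluation: the dictionary keys, the string encodings of the register fields ("ON", "OFF", "NORMAL", "PREPS") and the `base` state are assumptions made for illustration, not hardware or driver definitions.

```python
# Hypothetical model of the two formulas quoted in the commit message.
# Field names and value encodings are placeholders, not real register layouts.

def kill_pixel(s):
    """WM_INT::Pixel Shader Kill Pixel, transcribed from the commit message."""
    return s["force_kill_pixel"] == "ON" or (
        s["force_kill_pixel"] != "OFF"
        and not s["wm_hz_op"]
        and s["edsc_mode"] != "PREPS"
        and (s["depth_write"] or s["stencil_write"])
        and (s["ps_kills_pixels"] or s["omask_to_rt"]
             or s["alpha_to_coverage"] or s["alpha_test"]
             or s["chroma_key_kill"]))

def thread_dispatch(s):
    """WM_INT::ThreadDispatchEnable, transcribed from the commit message."""
    return (
        s["force_thread_dispatch"] != "OFF"
        and not s["wm_hz_op"]
        and s["ps_valid"]
        and (s["ps_has_uav"]
             or kill_pixel(s)
             or s["rt_independent_raster"]
             or (not s["ps_does_not_write_rt"] and s["has_writeable_rt"])
             or (s["computed_depth_mode"] != "PSCDEPTH_OFF"
                 and (s["depth_test"] or s["depth_write"]))
             or (s["computed_stencil"] and s["stencil_test"])
             or (s["edsc_mode"] == 1
                 and (s["depth_test"] or s["depth_write"] or s["stencil_test"]))))

# Assumed state for a draw with no attachments and a shader that discards.
base = {
    "force_kill_pixel": "NORMAL", "force_thread_dispatch": "NORMAL",
    "wm_hz_op": False, "edsc_mode": 0,
    "depth_write": False, "stencil_write": False,
    "depth_test": False, "stencil_test": False,
    "ps_kills_pixels": True, "omask_to_rt": False,
    "alpha_to_coverage": False, "alpha_test": False, "chroma_key_kill": False,
    "ps_valid": True, "ps_has_uav": False, "rt_independent_raster": False,
    "ps_does_not_write_rt": False, "has_writeable_rt": False,
    "computed_depth_mode": "PSCDEPTH_OFF", "computed_stencil": False,
}

no_dispatch = thread_dispatch(base)                         # shader never runs
with_uav = thread_dispatch({**base, "ps_has_uav": True})    # the workaround
```

Under these assumed inputs the discard flag alone is not enough to enable dispatch, because the depth/stencil write term gates it; setting the UAV bit short-circuits the whole disjunction.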
Jason Ekstrand	433cf90650	nir/spirv: Remove the NoContraction hack NIR now just handles this for us by not fusing if the multiply is marked as exact.	2016-03-28 13:07:39 -07:00
Jason Ekstrand	5d9afb65a6	i965/peephole_ffma: Only match a mul+add if none of the ops are exact	2016-03-28 13:07:39 -07:00
Jason Ekstrand	035f66025b	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code: precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression.	2016-03-28 13:07:39 -07:00
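The per-node check described in commit 035f66025b can be illustrated with a toy matcher. This is a minimal sketch and not the nir_search code: `Expr`, `matches`, and the tuple-based pattern encoding are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Expr:
    """Toy expression node; 'exact' mirrors the flag discussed in the commit."""
    op: str
    exact: bool = False
    srcs: List["Expr"] = field(default_factory=list)

def matches(pattern, expr, inexact=True):
    """Match a pattern against an expression tree.

    A pattern is either a string (a variable that matches any value) or a
    tuple (op, sub-pattern, ...). When the pattern is inexact, matching bails
    on every exact node it walks over, not only on the root.
    """
    if isinstance(pattern, str):
        return True
    op, *subpatterns = pattern
    if expr.op != op or len(subpatterns) != len(expr.srcs):
        return False
    if inexact and expr.exact:
        return False  # an inexact rewrite must not consume an exact node
    return all(matches(p, s, inexact) for p, s in zip(subpatterns, expr.srcs))

# "b + c < 0", where the pattern looks for flt(fadd(x, y), z).
pattern = ("flt", ("fadd", "x", "y"), "z")
b, c, zero = Expr("b"), Expr("c"), Expr("zero")
plain = Expr("flt", srcs=[Expr("fadd", srcs=[b, c]), zero])
precise = Expr("flt", srcs=[Expr("fadd", exact=True, srcs=[b, c]), zero])
```

Here `precise` models the GLSL snippet from the commit message: the comparison at the root is not exact, so a root-only check would still allow the rewrite, while the per-node check rejects it.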
Rhys Kidd	668b6ddfc5	vc4: Remove unused include from vc4_nir_lower_txf_ms.c Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-03-28 11:51:11 -07:00
Adam Jackson	2b8492d63e	glapi/glx: Treat xserver generated targets as .PHONY Meaning, always rebuild them when asked instead of bothering to look at timestamps (and then wondering why nothing happened when you said make). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:37:12 -04:00
Adam Jackson	c2f0bc2537	glapi/glx: Thunk non-ABI calls through GetProcAddress Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:37:12 -04:00
Adam Jackson	ce3f0b23d1	glapi/glx: Emit direct GL calls instead of dispatch lookup Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:28:51 -04:00
Adam Jackson	c0a9cbea4d	glx: Unbreak generating some of the xorg glx headers Broken by: commit `9ace0b5422` Author: Dylan Baker <baker.dylan.c@gmail.com> Date: Wed May 20 15:49:11 2015 -0700 glapi: glX_proto_size.py: use argparse instead of getopt Which changed most, but not all, callers to use --header-tag instead of -h. Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:28:36 -04:00
Bas Nieuwenhuizen	dd5f0950e4	mesa/st: Fix NULL access if no fragment shader is bound Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-28 18:02:07 +02:00
Rob Clark	b4c72b792c	freedreno/ir3: fix for load_front_face intrinsic Seems like trying to widen in the same instruction as the add.s does a non-sign-extending widen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-28 10:19:53 -04:00
Rob Clark	3ca034cada	freedreno/ir3: fix compiler warn Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-28 10:19:09 -04:00
Ilia Mirkin	b9f1affb2e	nvc0: make sure to disable fetches from previously-set VBOs when blitting We disable the vertex attributes, but also disable the VBO fetch details as well, just in case. Not known to fix anything. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 08:36:34 -04:00
Ilia Mirkin	41100b6b44	nvc0: disable primitive restart and index bias during blits Back in the dawn of time, we used to do immediate uploads for the vertex data, and all was well. However Maxwell dropped support for immediate vertex data, so we started feeding in a VBO (in all cases). But we forgot to disable some things that apply in such cases, specifically primitive restart and index bias. The latter was causing WoW and other Blizzard games trouble as they use a pattern where they draw with a base vertex (aka index bias), followed by texture uploads (aka blits, internally). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <nouveau@karolherbst.de>	2016-03-28 08:35:38 -04:00
Ilia Mirkin	f667d15561	nvc0/ir: fix picking of coordinates from tex instruction for textureGrad On Fermi, there's an argument in front of the coords that combines array and indirect handle, while on Kepler the array and the indirect handle are separate (and in front of the coords). We were previously only accounting for the array bit of it, if there were an indirect access it wouldn't be counted in the formula. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-28 08:35:38 -04:00
Ilia Mirkin	6711f159d9	nv50/ir: saturate depth writes Apparently there's no post-FS clamping logic, so we have to do this by hand. The depth will never be outside of the 0..1 range, even on floating point zeta buffers, so this should be safe. Fixes dEQP-GLES3.functional.fbo.depth.clamp. which tests writing invalid values on various zeta buffer formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 08:35:38 -04:00
Marek Olšák	6262d6125a	gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2) v2: move the nr_cbufs check above the loop Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)	2016-03-28 00:46:23 +02:00
Marek Olšák	21c479256a	st/mesa: only minify height if target != 1D array in st_finalize_texture The st_texture_object documentation says: "the number of 1D array layers will be in height0" We can't minify that. Spotted by luck. No app is known to hit this issue. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 00:44:45 +02:00
Miklós Máté	50d653c2bb	mesa: optimize out the realloc from glCopyTexImagexD() v2: comment about the purpose of the code v3: also compare texFormat, add a perf debug message, formatting fixes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	baab345b19	st/mesa: fix handling the fallback texture This fixes a crash when post-processing is enabled in SW:KotOR. v2: fix const-ness v3: move assignment into the if() block Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	920fbecf57	st/mesa: enable GL_ATI_fragment_shader Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	dee274477f	st/mesa: implement GL_ATI_fragment_shader v2: fix arithmetic for special opcodes, fix fog state, cleanup v3: simplify handling of special opcodes, fix rebinding with different textargets or fog equation, lots of formatting fixes v4: adapt to the compile early, fix later architecture, formatting fixes Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	d71c1e9e54	program: add ATI_fragment_shader to shader stages list Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	e2d5a6fac5	mesa: optionally associate a gl_program to ATI_fragment_shader the state tracker will use it Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Edward O'Callaghan	11bd53933e	gallium/p_context.h: Make comment more readable Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:03:04 +02:00
Edward O'Callaghan	2df141087a	mesa/st: Remove GLSLVersion clamping While here, remove intermediate glsl_feature_level variable. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:36 +02:00
Edward O'Callaghan	ca22d2f1fd	radeon/r600: Fix return type in failure branch Commit `d4e847ea` introduced a warning about making an integer from a pointer without a cast, fix it here. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:35 +02:00
Edward O'Callaghan	1fb05a9a0c	radeon/r600_query.c: Minor style fix Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:35 +02:00
Dave Airlie	fc3b000fef	virgl: drop next shader property for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-26 17:50:32 +10:00
Jason Ekstrand	6d658e9bd5	i965: Allow mul+add fusing again	2016-03-25 21:35:41 -07:00
Jason Ekstrand	fbb9e1f008	spirv/alu: Add support for the NoContraction decoration	2016-03-25 21:35:41 -07:00
Jason Ekstrand	00fa795cd3	spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes This is similar to the way that regular ALU operations are handled.	2016-03-25 21:35:41 -07:00
Jason Ekstrand	98522c1853	nir/spirv: Get rid of the spirv2nir helper binary This was useful once upon a time but now that we have a real Vulkan driver to run our SPIR-V binaries through, there's really no point.	2016-03-25 21:35:41 -07:00
Nanley Chery	0e82896a11	anv/blit2d: Add a function to create an ImageView This function differs from the open-coded implementation in that the ImageView's width is determined by the caller and is not unconditionally set to match the number of texels within the surface's pitch. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-25 17:33:50 -07:00
Nanley Chery	4eab37d6cd	anv/image: Enable specifying a surface's minimum pitch This is required to create multiple, horizontally adjacent, max-width images from one blit2d surface. This is also required for more accurate width specification of surfaces within a larger surface (which is seen as the smaller surface's enclosing region). Note that anv_image_create_info::stride has been unused since commit `b369389640`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-25 17:33:40 -07:00
Timothy Arceri	8683d54d2b	glsl: reduce buffer block duplication This reduces some of the craziness required for handling buffer blocks. The problem is that each shader stage holds its own information about a block in memory. We were copying that information to a program-wide list, but the per-stage information remained, meaning that when a binding was updated we needed to update all versions of it. This changes the per-stage blocks to instead point to a single version of the block information in the program list. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-26 09:26:30 +11:00
Jason Ekstrand	38250a9ca3	i965/vec4: Get rid of a stray predicate inverse in opquantizef16 This fixes 30 opquantize CTS tests on HSW	2016-03-25 14:37:37 -07:00
Jason Ekstrand	13bad493b4	nir/algebraic: Get rid of a redundant copy of fdiv lowering	2016-03-25 14:04:05 -07:00
Jason Ekstrand	08fe89864b	nir/algebraic: Add better lowering of ldexp	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b75d770963	nir/builder: Simplify nir_ssa_undef a bit	2016-03-25 14:04:05 -07:00
Jason Ekstrand	ab31951bef	nir/spirv: Use the nir_ssa_undef helper from nir_builder	2016-03-25 14:04:05 -07:00
Jason Ekstrand	d2eee52a65	nir/builder: Add a bit size field to nir_ssa_undef	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b50f7f0011	nir: Add a better comment for INTRINSIC_RANGE	2016-03-25 14:04:05 -07:00
Jason Ekstrand	add8c837b5	nir/glsl: Stop carrying a pointer to the nir_shader in the visitor	2016-03-25 14:04:05 -07:00
Brian Paul	a8e5edaadf	st/xa: emit sampler view declarations in shaders Fixes recent regressions with the VMware gallium driver. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2016-03-25 14:53:59 -06:00
Tim Rowley	74a04840e5	swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32)	2016-03-25 14:45:40 -05:00
Tim Rowley	93c1a2dedf	swr: [rasterizer core] NUMA optimizations... - Affinitize hot-tile memory to specific NUMA nodes. - Only do BE work for macrotiles associated with the NUMA node	2016-03-25 14:45:40 -05:00
Tim Rowley	090be2e434	swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage.	2016-03-25 14:45:40 -05:00
Tim Rowley	0767e820fd	swr: [rasterizer core] Fix Compute workitem retirement	2016-03-25 14:45:40 -05:00
Tim Rowley	813e89c0cc	swr: [rasterizer core] Cleanup state ring arena after last draw that references it completes Rather than waiting for the API thread to re-use it.	2016-03-25 14:45:40 -05:00
Tim Rowley	83822d7ed5	swr: [rasterizer jitter] add missing include for llvm jitevents	2016-03-25 14:45:40 -05:00
Tim Rowley	51549912d1	swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB). With global allocator this doesn't seem to affect performance at all. Overall memory consumption drops by up to 85%.	2016-03-25 14:45:40 -05:00
Tim Rowley	ed5b953919	swr: [rasterizer core] One last pass at Arena optimizations	2016-03-25 14:45:40 -05:00
Tim Rowley	ee6be9e92d	swr: [rasterizer core] CachedArena optimizations Reduce list traversal during Alloc and Free. Add ability to have multiple lists based on alloc size (not used for now)	2016-03-25 14:45:39 -05:00
Tim Rowley	68314b6769	swr: [rasterizer jitter] support llvm-svn	2016-03-25 14:45:39 -05:00
Tim Rowley	ec9d4c4b37	swr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation.	2016-03-25 14:45:39 -05:00
Tim Rowley	12ce9d9aa1	swr: [rasterizer] more arena work	2016-03-25 14:45:39 -05:00
Tim Rowley	4893224e28	swr: [rasterizer core] Add clipping against user clip distances in the NullPS backend.	2016-03-25 14:45:39 -05:00
Tim Rowley	700a5b06e0	swr: [rasterizer core] Arena optimizations - preparing for global allocator.	2016-03-25 14:45:39 -05:00
Tim Rowley	5899076b6b	swr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclaim of DC Keeps overall memory consumption lower. Also, remove unused knobs.	2016-03-25 14:45:39 -05:00
Tim Rowley	7390418441	swr: [rasterizer core] Add clipping of user clip planes in clipper.	2016-03-25 14:45:39 -05:00
Tim Rowley	4b4547a721	swr: [rasterizer] Reduce max in-flight draws to 96 (by default)	2016-03-25 14:45:39 -05:00
Tim Rowley	9111d63228	swr: [rasterizer] Fix run-time check asserts One innocuous (uninitialized variable), and one not so innocuous (stack corruption).	2016-03-25 14:45:39 -05:00
Tim Rowley	257db3610a	swr: [rasterizer jitter] signed immediate builder	2016-03-25 14:45:39 -05:00
Tim Rowley	b958aea78a	swr: [rasterizer common] changes for cygwin	2016-03-25 14:45:39 -05:00
Tim Rowley	e1222ade00	swr: [rasterizer] code styling and update copyrights	2016-03-25 14:45:14 -05:00
Tim Rowley	c75314ec67	swr: [rasterizer core] Guard against enqueuing work to invalid hot tiles	2016-03-25 14:43:15 -05:00
Tim Rowley	fee56fda6f	swr: [rasterizer] Stop setting viewport size to larger than hottile array Guard against enqueuing work to invalid tiles	2016-03-25 14:43:14 -05:00
Tim Rowley	e374d2d24b	swr: [rasterizer] Discard work + misc fixes	2016-03-25 14:43:14 -05:00
Tim Rowley	542d7dec7b	swr: [rasterizer] remove use of BYTE type	2016-03-25 14:43:14 -05:00
Tim Rowley	be4c558d01	swr: [rasterizer core] Fix crash that can occur when switching contexts	2016-03-25 14:43:14 -05:00
Tim Rowley	51a11658d9	swr: [rasterizer] remove unused knob	2016-03-25 14:43:14 -05:00
Tim Rowley	61beaa2279	swr: [rasterizer core] subcontext rework	2016-03-25 14:43:14 -05:00
Tim Rowley	0c18900cfb	swr: [rasterizer common] add _simd_s[rl]lv_epi32	2016-03-25 14:43:14 -05:00
Tim Rowley	bef222db22	swr: [rasterizer core] Alleviate potential stack overflow for 32bit builds Move large stack allocations in the GS and clipper into thread local storage.	2016-03-25 14:43:14 -05:00
Tim Rowley	3132f731f8	swr: [rasterizer] remove use of UCHAR and UINT64 types	2016-03-25 14:43:14 -05:00
Tim Rowley	643857f596	swr: [rasterizer] remove use of FLOAT type	2016-03-25 14:43:14 -05:00
Tim Rowley	3252fe3705	swr: [rasterizer] Fix Coverity issues reported by Mesa developers.	2016-03-25 14:43:14 -05:00
Tim Rowley	45d52673c2	swr: [rasterizer] add debug/perf category to knobs	2016-03-25 14:43:13 -05:00
Tim Rowley	1da9c8a970	swr: [rasterizer core] don't assume linux is 64-bit	2016-03-25 14:43:13 -05:00
Tim Rowley	49678803f7	swr: [rasterizer common] remove old unused win32 types	2016-03-25 14:43:13 -05:00
Tim Rowley	aca5513184	swr: [rasterizer jitter] vpermps support	2016-03-25 14:43:13 -05:00
Tim Rowley	bfb954189e	swr: [rasterizer] Add rdtsc buckets support for shaders Pass pointer to core buckets mgr back to sim layer. Add support for RDTSC_START/RDTSC_STOP macros in the builder. Each unique shader now has a unique bucket associated with it, enabling more detailed reporting at the shader level. Currently due to some llvm issue with thread local storage, 64bit runs require single threaded mode.	2016-03-25 14:43:13 -05:00
Tim Rowley	abd4aa68cc	swr: [rasterizer core] backend reorganization	2016-03-25 14:43:13 -05:00
Tim Rowley	13303f3320	swr: [rasterizer core] store blend output in temporary instead of PS output. Fixes additive blend problem with MSAA	2016-03-25 14:26:17 -05:00
Tim Rowley	3f4fba3772	swr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of threads.cpp.	2016-03-25 14:26:17 -05:00
Tim Rowley	bdd690dc36	swr: [rasterizer jitter] Cleanup use of types inside of Builder. Also, cached the simd width since we don't have to keep querying the JitManager for it.	2016-03-25 14:26:17 -05:00
Tim Rowley	7ead4959a5	swr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS	2016-03-25 14:26:17 -05:00
Tim Rowley	136988b42b	swr: [rasterizer core] fix rasterizing multisampling with scissor enabled We were not evaluating the scissor edge equations at sample positions.	2016-03-25 14:26:17 -05:00
Tim Rowley	45f0ce168c	swr: [rasterizer core] RingBuffer class for DC/DS Use head/tail ring buffer indices for thread synchronization. 1. SwrWaitForIdle loops until ring is empty. (head == tail) 2. GetDrawContext waits until ring is not full. (head - tail) == Ring Size 3. Draw enqueues by incrementing head. 4. Last worker thread to move past a DC dequeues by incrementing tail. Todo: To reduce contention we can cache the tail in the API thread. For example, if you know you have 64 free entries in the ring then you don't need to keep checking the tail until you used those 64 entries.	2016-03-25 14:26:17 -05:00
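The head/tail scheme listed in commit 45f0ce168c can be sketched as a small single-threaded model. This is an illustration under assumptions, not the SWR RingBuffer class: `DrawRing` and its method names are hypothetical, and the real code would need atomic counters and would wait rather than return `None` when the ring is full.

```python
class DrawRing:
    """Toy head/tail ring for draw contexts (single-threaded illustration)."""

    def __init__(self, size):
        self.size = size
        self.head = 0  # advanced by the API thread when a draw is enqueued
        self.tail = 0  # advanced by the last worker to move past an entry

    def is_empty(self):
        # Step 1: waiting for idle means waiting until head == tail.
        return self.head == self.tail

    def is_full(self):
        # Step 2: the ring is full when (head - tail) == ring size.
        return (self.head - self.tail) == self.size

    def enqueue(self):
        # Step 3: a draw enqueues by incrementing head.
        if self.is_full():
            return None  # a real implementation would wait here instead
        slot = self.head % self.size
        self.head += 1
        return slot

    def dequeue(self):
        # Step 4: retiring an entry increments tail.
        if self.is_empty():
            raise IndexError("ring is empty")
        self.tail += 1


ring = DrawRing(2)
first = ring.enqueue()
second = ring.enqueue()
overflow = ring.enqueue()   # ring is full at this point
ring.dequeue()
reused = ring.enqueue()     # wraps around to the first slot
```

Because head and tail only ever increase, the full and empty states are distinguishable without a separate count, which is what makes the two counters sufficient for synchronization in this design.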
Tim Rowley	dd0f9eed8c	swr: [rasterizer] switch assert uses to SWR_ASSERT	2016-03-25 14:26:16 -05:00
Tim Rowley	45a4afa634	swr: [rasterizer core] Split all RECT_LIST draws into 1 RECT per draw Needed until proper RECT_LIST PrimAssembly code is written.	2016-03-25 14:26:16 -05:00
Tim Rowley	3a25185990	swr: [rasterizer] Add string knob type	2016-03-25 14:26:16 -05:00
Jordan Justen	8f3c236674	anv: Use genxml register support for L3 Cache config The programming of the L3 Cache registers should match the previous manually packed LRI values. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-25 00:19:18 -07:00
Jordan Justen	7a03fb9ccb	genxml: Add L3 Cache Control register definitions Based on intel_reg.h (`5912da45a6`) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:49:53 -07:00
Jordan Justen	d353ba8f5f	anv: Add genxml register support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:49:53 -07:00
Jordan Justen	b332013a56	genxml: Add register support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:46:59 -07:00
Sonny Jiang	f00c840578	radeonsi: add Polaris PCI IDs Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (Polaris10) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (Polaris11)	2016-03-24 23:08:12 -04:00
Sonny Jiang	f87ed903fb	radeon/vce: disable two pipe mode for Polaris11 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-24 23:08:04 -04:00
Sonny Jiang	0c5477465f	radeon/vce: add Polaris11 VCE firmware support Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>	2016-03-24 23:07:53 -04:00
Sonny Jiang	42e442d888	radeonsi: add support for Polaris (v2) v2: Polaris chips should be defined after Stoney Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> (v1) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Signed-off-by: Leo Liu <leo.liu@amd.com> (v2 diff) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2 diff)	2016-03-24 23:07:32 -04:00
Sonny Jiang	f5e24b19e8	winsys/amdgpu: addrlib - add Polaris support (v2) v2: fix indentation as noted by Michel Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-24 23:06:39 -04:00
Jason Ekstrand	2c3f95d6aa	Merge remote-tracking branch 'public/master' into vulkan	2016-03-24 17:30:14 -07:00
Kenneth Graunke	511ce2925b	mesa: Check glReadBuffer enums against the ES3 table. From the ES 3.2 spec, section 16.1.1 (Selecting Buffers for Reading): "An INVALID_ENUM error is generated if src is not BACK or one of the values from table 15.5." Table 15.5 contains NONE and COLOR_ATTACHMENTi. Mesa properly returned INVALID_ENUM for unknown enums, but it decided what was known by using read_buffer_enum_to_index, which handles all enums in every API. So enums that were valid in GL were making it past the "valid enum" check. Such targets would then be classified as unsupported, and we'd raise INVALID_OPERATION, but that's technically the wrong error code. Fixes dEQP-GLES31's functional.debug.negative_coverage.get_error.buffer.read_buffer v2: Only call read_buffer_enum_to_index when required (Eduardo). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 16:52:08 -07:00
Nanley Chery	a5dc3c0f02	anv: Sanitize Image extents and offsets Prepare Image extents and offsets for internal consumption by assigning the default values implicitly defined by the spec. Fixes textures on several Vulkan demos in which the VkImageCopy depth is set to zero when copying a 2D image. v2 (Jason Ekstrand): Replace "prep" with "sanitize" Make function static inline Pass structs instead of pointers Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2016-03-24 16:15:00 -07:00
Jason Ekstrand	22b343a8ec	nir: Add a pass to inline functions This commit adds a new NIR pass that lowers all function calls away by inlining the functions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	debf23ec68	nir/builder: Add helpers for easily inserting copy_var intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	79dec93ead	nir: Add return lowering pass This commit adds a NIR pass for lowering away returns in functions. If the return is in a loop, it is lowered to a break. If it is not in a loop, it's lowered away by moving/deleting code as needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	8d61d72524	nir: Add a cursor helper for getting a cursor after any phi nodes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	18b0166749	nir/builder: Add a helper for inserting jump instructions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	97b663481c	nir/cf: Make extracting or re-inserting nothing a no-op Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	7022a673cd	nir: Add a function for comparing cursors Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	124f229ece	nir/cf: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	364212f1ed	nir: Add a pass to repair SSA form Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	ea98d415e4	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes. As a side benefit, the phi builder actually handles unreachable blocks correctly. The original vars_to_ssa code, because of the way it iterated the blocks and added phi sources, didn't add sources corresponding to predecessors of unreachable blocks. The new strategy employed by the phi builder creates a phi source for each predecessor and should correctly handle unreachable blocks by setting those sources to SSA undefs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	42ddfc611f	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	e4dc82cfcf	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value. v2: Add better documentation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	9a41d94731	util/bitset: Allow iterating over const bitsets Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Rob Clark	61c7d20e4f	ttn: remove stray global from header Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-24 16:04:54 -04:00
Samuel Pitoiset	b9c70fcdad	nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER info radeonsi uses this property to make the best decision about which shader to compile, but this is not currently used by our codegen. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-24 18:53:24 +01:00
Kenneth Graunke	d1bb1df87e	mesa: Handle negative length in glPushDebugGroup(). The KHR_debug spec doesn't actually say we should handle this, but that is most likely an oversight - it says to check against strlen and generate errors if length is negative. It appears they just forgot to explicitly spell out that we should then proceed to actually handle it. Fixes crashes from uncaught std::string exceptions in many dEQP-GLES31.functional.debug.error_filters.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:50 -07:00
Kenneth Graunke	028459a00d	mesa: Make glDebugMessageInsert deal with negative length for all types. From the KHR_debug spec, section 5.5.5 (Externally Generated Messages): "If <length> is negative, it is implied that <buf> contains a null terminated string. The error INVALID_VALUE will be generated if the number of characters in <buf>, excluding the null terminator when <length> is negative, is not less than the value of MAX_DEBUG_MESSAGE_LENGTH." This indicates that length should be set to strlen for all types, not just GL_DEBUG_TYPE_MARKER. We want it to be after validate_length() so we still generate appropriate errors. Fixes crashes from uncaught std::string exceptions in many dEQP-GLES31.functional.debug.error_filters.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:45 -07:00
Kenneth Graunke	412e686da9	mesa: Include null terminator in GL_DEBUG_NEXT_LOGGED_MESSAGE_LENGTH. From the KHR_debug spec: "Applications can query the number of messages currently in the log by obtaining the value of DEBUG_LOGGED_MESSAGES, and the string length (including its null terminator) of the oldest message in the log through the value of DEBUG_NEXT_LOGGED_MESSAGE_LENGTH." Because we weren't including the null terminator, many dEQP tests called glGetDebugMessageLog with a bufSize parameter that was 1 too small, and unable to contain the message, so we skipped returning it, failing many cases. Fixes 298 dEQP-GLES31.functional.debug.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:29 -07:00
Nicolai Hähnle	6b763c026d	st/mesa: use RGBA instead of BGRA for SRGB_ALPHA This fixes a regression introduced by commit `a8eea696` "st/mesa: honour sized internal formats in st_choose_format (v2)". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94657 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94671 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-24 12:23:31 -05:00
Nicolai Hähnle	7880b81d39	radeonsi: silence a coverity warning The following Coverity warning 5378 tmpl.fetch_args = atomic_fetch_args; 5379 tmpl.emit = atomic_emit; >>> CID 1357115: Uninitialized variables (UNINIT) >>> Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized. 5380 bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl; 5381 bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add"; ... is a false positive, but what the hell. This change should "fix" it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-24 12:23:14 -05:00
Bas Nieuwenhuizen	f96309753b	mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled. This removes any dependency on driver validation of the number of framebuffer samples. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-03-24 08:36:43 -06:00
Rob Clark	0bea0e7141	nir: fix dangling ssadef->name ptrs In many places, the convention is to pass an existing ssadef name ptr when constructing/initializing a new nir_ssa_def. But that goes badly (as noticed by garbage in nir_print output) when the original string gets freed. Just use ralloc_strdup() instead, and add ralloc_free() in the two places that would care (not that the strings wouldn't eventually get freed anyways). Also fixup the nir_search code which was directly setting ssadef->name to use the parent instruction as memctx. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-24 08:30:04 -04:00
Jason Ekstrand	4e060d80ff	glsl: Add propagate_invariance to the other makefile This fixes the scons build	2016-03-23 21:12:44 -07:00
Jason Ekstrand	a984e44abd	nir/glsl: Propagate invariant into NIR alu ops Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	028d6ecfe0	glsl/rebalance_tree: Don't handle invariant or precise trees Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	b2209b2333	glsl/opt_algebraic: Don't handle invariant or precise trees Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	89b604922d	glsl: Add a pass to propagate the "invariant" and "precise" qualifiers Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	91d6272c2b	nir/alu_to_scalar: Propagate the "exact" bit Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	865e83b9ec	i965/peephole_ffma: Don't fuse exact adds Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	5f39e3e165	nir/cse: Properly handle nir_ssa_def.exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	0dbda153aa	nir/algebraic: Flag inexact optimizations Many of our optimizations, while great for cutting shaders down to size, aren't really precision-safe. This commit tries to flag all of the inexact floating-point optimizations so they don't get run on values that are flagged "exact". It's a bit conservative and maybe flags some safe optimizations as unsafe but that's better than missing one. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:02 -07:00
Jason Ekstrand	ed3a029e80	nir/algebraic: Fix fmin detection to match the spec The previous transformation got the arguments to fmin backwards. When NaNs are involved, the GLSL min/max aren't commutative so it matters. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:00 -07:00
Jason Ekstrand	89545b1314	nir/algebraic: Get rid of an invalid fxor optimization The fxor opcode is required to return 1.0f or 0.0f but the input variable may not be 1.0f or 0.0f. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:58 -07:00
Jason Ekstrand	3a7cb6534c	nir/algebraic: Allow for flagging operations as being inexact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:55 -07:00
Jason Ekstrand	a6f25fa7d7	nir/search: Propagate exactness into newly created expressions Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:52 -07:00
Jason Ekstrand	ded3133d47	nir/builder: Add a flag for setting exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	4ff89377d9	nir: Add an "exact" bit to nir_alu_instr Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	f849f53990	nir/clone: Export nir_variable_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:11 -07:00
Jason Ekstrand	5fe8959912	nir/clone: Expose nir_constant_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:08 -07:00
Jason Ekstrand	c4c373f156	nir: Fix whitespace Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:25:53 -07:00
Brian Paul	9a6da49371	docs: use latest libDRM version Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-23 12:56:32 -06:00
Lars Hamre	43c6f3f82f	compiler/glsl: allow sequence op as a const expr in gles 1.0 Allow the sequence operator to be a constant expression in GLSL ES versions prior to GLSL ES 3.0 Fixes the following piglit test: /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert This is similar to the logic from process_initializer() which performs the same check for constant variable initialization with sequence operators. v2: Fixed regression pointed out by Eduardo Lima Mitev Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-23 18:13:26 +01:00
Nicolai Hähnle	c4931ae174	radeonsi: fix out-of-bounds indexing of shader images Results are undefined but may not crash. Without this change, out-of-bounds indexing can lead to VM faults and GPU hangs. Constant buffers, samplers, and possibly others will eventually need similar treatment to support GL_ARB_robust_buffer_access_behavior. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-23 11:49:53 -05:00
Nicolai Hähnle	a8f5d11426	radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2) This fixes arb_shader_image_load_store-host-mem-barrier. v2: flush TC L2 for index buffers on <= CIK (Marek) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:19 -05:00
Nicolai Hähnle	fc94bc2986	st/mesa: add missing MemoryBarrier bits and some explanations Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:15 -05:00
Nicolai Hähnle	b15b1faefd	gallium: add PIPE_BARRIER_STREAMOUT_BUFFER Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:02 -05:00
Marek Olšák	b8ec205515	radeonsi: fix 2D array MSAA failures since image support landed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-23 12:14:15 +01:00
Jason Ekstrand	9881eab197	i965/fs: Don't constant-fold RCP No shader-db changes on Broadwell Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 16:46:15 -07:00
Jason Ekstrand	01425c45b3	i965: Remove the RCP+RSQ algebraic optimizations NIR already has this optimization and it can do much better than the little peephole in the backend. No shader-db change on Haswell or Broadwell. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 16:46:15 -07:00
Jason Ekstrand	20417b2cb0	anv/device: Advertise version 1.0.5 Nothing substantial has changed since 1.0.2	2016-03-22 16:21:23 -07:00
Jason Ekstrand	204d937ac2	anv/device: Ignore the patch portion of the requested API version Fixes dEQP-VK.api.device_init.create_instance_name_version Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94661	2016-03-22 16:20:45 -07:00
Jason Ekstrand	4844723405	anv: Don't assert-fail if someone asks for a non-existent entrypoint	2016-03-22 16:11:53 -07:00
Jason Ekstrand	8dd86e8aa7	Update to the latest Vulkan header from Khronos	2016-03-22 16:06:53 -07:00
Ian Romanick	d7a25a9def	nir: Don't abs slt and friends No shader-db changes, but this is symmetric with the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	2bb006af68	nir: Don't abs the result of b2f or b2i In the results below, 2 SIMD16 shaders in Trine are lost. G4X total instructions in shared programs: 4012279 -> 4011108 (-0.03%) instructions in affected programs: 116776 -> 115605 (-1.00%) helped: 339 HURT: 0 total cycles in shared programs: 84315862 -> 84313584 (-0.00%) cycles in affected programs: 1767232 -> 1764954 (-0.13%) helped: 274 HURT: 81 Ironlake total instructions in shared programs: 6399073 -> 6396998 (-0.03%) instructions in affected programs: 218050 -> 215975 (-0.95%) helped: 600 HURT: 0 total cycles in shared programs: 128892088 -> 128888810 (-0.00%) cycles in affected programs: 2867452 -> 2864174 (-0.11%) helped: 422 HURT: 137 Sandy Bridge total instructions in shared programs: 8462174 -> 8460759 (-0.02%) instructions in affected programs: 178529 -> 177114 (-0.79%) helped: 596 HURT: 0 total cycles in shared programs: 117542276 -> 117534098 (-0.01%) cycles in affected programs: 1239166 -> 1230988 (-0.66%) helped: 369 HURT: 150 Ivy Bridge total instructions in shared programs: 7775131 -> 7773410 (-0.02%) instructions in affected programs: 162903 -> 161182 (-1.06%) helped: 590 HURT: 0 total cycles in shared programs: 65759882 -> 65747268 (-0.02%) cycles in affected programs: 1004354 -> 991740 (-1.26%) helped: 467 HURT: 141 Haswell total instructions in shared programs: 7107786 -> 7106327 (-0.02%) instructions in affected programs: 140954 -> 139495 (-1.04%) helped: 590 HURT: 0 total cycles in shared programs: 64668028 -> 64655322 (-0.02%) cycles in affected programs: 967080 -> 954374 (-1.31%) helped: 452 HURT: 149 LOST: 2 GAINED: 0 Broadwell total instructions in shared programs: 8980029 -> 8978287 (-0.02%) instructions in affected programs: 197232 -> 195490 (-0.88%) helped: 715 HURT: 0 total cycles in shared programs: 70070448 -> 70055970 (-0.02%) cycles in affected programs: 975724 -> 961246 (-1.48%) helped: 471 HURT: 111 LOST: 2 GAINED: 0 Skylake total instructions in shared programs: 9115178 -> 9113436 (-0.02%) instructions in affected programs: 203012 -> 201270 (-0.86%) helped: 715 HURT: 0 total cycles in shared programs: 68848660 -> 68834004 (-0.02%) cycles in affected programs: 993888 -> 979232 (-1.47%) helped: 473 HURT: 116 LOST: 2 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	348e5a71d8	nir: Simplify 0 < fabs(a) Sandy Bridge / Ivy Bridge / Haswell total instructions in shared programs: 8462180 -> 8462174 (-0.00%) instructions in affected programs: 564 -> 558 (-1.06%) helped: 6 HURT: 0 total cycles in shared programs: 117542462 -> 117542276 (-0.00%) cycles in affected programs: 9768 -> 9582 (-1.90%) helped: 12 HURT: 0 Broadwell / Skylake total instructions in shared programs: 8980833 -> 8980826 (-0.00%) instructions in affected programs: 626 -> 619 (-1.12%) helped: 7 HURT: 0 total cycles in shared programs: 70077900 -> 70077714 (-0.00%) cycles in affected programs: 9378 -> 9192 (-1.98%) helped: 12 HURT: 0 G45 and Ironlake showed no change. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:47:56 -07:00
Ian Romanick	564a8b8a26	nir: Simplify 0 >= b2f(a) This also prevented some regressions with other patches in my local tree. Broadwell / Skylake total instructions in shared programs: 8980835 -> 8980833 (-0.00%) instructions in affected programs: 45 -> 43 (-4.44%) helped: 1 HURT: 0 total cycles in shared programs: 70077904 -> 70077900 (-0.00%) cycles in affected programs: 122 -> 118 (-3.28%) helped: 1 HURT: 0 No changes on earlier platforms. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:44:57 -07:00
Ian Romanick	bf0d60aa11	nir: Simplify i2b with negated or abs operand This enables removing ssa_201 and ssa_202 in sequences like: vec1 ssa_200 = flt ssa_199, ssa_194 vec1 ssa_201 = b2i ssa_200 vec1 ssa_202 = i2b -ssa_201 shader-db results: Sandy Bridge total instructions in shared programs: 8462257 -> 8462180 (-0.00%) instructions in affected programs: 3846 -> 3769 (-2.00%) helped: 35 HURT: 0 total cycles in shared programs: 117542934 -> 117542462 (-0.00%) cycles in affected programs: 20072 -> 19600 (-2.35%) helped: 20 HURT: 1 Ivy Bridge total instructions in shared programs: 7775252 -> 7775137 (-0.00%) instructions in affected programs: 3645 -> 3530 (-3.16%) helped: 35 HURT: 0 total cycles in shared programs: 65760522 -> 65760068 (-0.00%) cycles in affected programs: 21082 -> 20628 (-2.15%) helped: 25 HURT: 2 Haswell total instructions in shared programs: 7108666 -> 7108589 (-0.00%) instructions in affected programs: 3253 -> 3176 (-2.37%) helped: 35 HURT: 0 total cycles in shared programs: 64675726 -> 64675272 (-0.00%) cycles in affected programs: 21034 -> 20580 (-2.16%) helped: 26 HURT: 1 Broadwell / Skylake total instructions in shared programs: 8980912 -> 8980835 (-0.00%) instructions in affected programs: 3223 -> 3146 (-2.39%) helped: 35 HURT: 0 total cycles in shared programs: 70077926 -> 70077904 (-0.00%) cycles in affected programs: 21886 -> 21864 (-0.10%) helped: 21 HURT: 6 G45 and Ironlake showed no change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:43:28 -07:00
Ian Romanick	a4079f1cb2	nir: Lower flrp with Boolean interpolator to bcsel On Intel platforms that don't set lower_flrp, using bcsel instead of flrp seems to be a small amount worse. On those platforms, the use of flrp, bcsel, and multiply of b2f is still an active area of research. In review, Matt suggested this is because bcsel turns into CMP+SEL, and because of the flag register we can't schedule instructions well. shader-db results: G4X / Ironlake total instructions in shared programs: 4016538 -> 4012279 (-0.11%) instructions in affected programs: 161556 -> 157297 (-2.64%) helped: 1077 HURT: 1 total cycles in shared programs: 84328296 -> 84315862 (-0.01%) cycles in affected programs: 4174570 -> 4162136 (-0.30%) helped: 926 HURT: 53 Unsurprisingly, no changes on later platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Ian Romanick	9442db4f89	i965: Have NIR lower flrp on pre-GEN6 vec4 backend Previously we were doing the lowering by hand in vec4_visitor::emit_lrp. By doing it in NIR, we have the opportunity for NIR to do additional optimization of the expanded code. This also enables optimizations added by the next commit. shader-db results: G4X / Ironlake total instructions in shared programs: 4024401 -> 4016538 (-0.20%) instructions in affected programs: 447686 -> 439823 (-1.76%) helped: 2623 HURT: 0 total cycles in shared programs: 84375846 -> 84328296 (-0.06%) cycles in affected programs: 16964960 -> 16917410 (-0.28%) helped: 2556 HURT: 41 Unsurprisingly, no changes on later platforms. v2: Formatting and comment changes suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Brian Paul	18c5fa1122	swrast: fix discarded const warning in s_texture.c Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-22 08:35:27 -06:00
Marc-André Lureau	530593da65	i965: fix invalid memory write I noticed some heap corruption running virgl tests, and valgrind helped me to track it down to the following error: ==29272== Invalid write of size 4 ==29272== at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307) ==29272== by 0x9029A7D: brw_DO (brw_eu_emit.c:1750) ==29272== by 0x90554B0: fs_generator::generate_code(cfg_t const, int) (brw_fs_generator.cpp:1999) ==29272== by 0x904491F: brw_compile_fs (brw_fs.cpp:5685) ==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137) ==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638) ==29272== by 0x8FA4040: brw_shader_precompile(gl_context, gl_shader_program) (brw_link.cpp:51) ==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260) ==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006) ==29272== by 0x8C84325: _mesa_link_program (shaderapi.c:1042) ==29272== by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515) ==29272== by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880) ==29272== Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd ==29272== at 0x4C2AA98: calloc (vg_replace_malloc.c:711) ==29272== by 0x8ED11F7: ralloc_size (ralloc.c:113) ==29272== by 0x8ED1282: rzalloc_size (ralloc.c:134) ==29272== by 0x8ED14C0: rzalloc_array_size (ralloc.c:196) ==29272== by 0x9019C7B: brw_init_codegen (brw_eu.c:291) ==29272== by 0x904F565: fs_generator::fs_generator(brw_compiler const, void, void, void const, brw_stage_prog_data, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124) ==29272== by 0x9044883: brw_compile_fs (brw_fs.cpp:5675) ==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137) ==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638) ==29272== by 0x8FA4040: brw_shader_precompile(gl_context, gl_shader_program) (brw_link.cpp:51) ==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260) ==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006) if_depth_in_loop is an array of size p->loop_stack_array_size, and push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1], thus the condition to grow the array should be p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently off by 2...) This can be reproduced by running the following test with virgl test server: LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner ./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-21 20:50:07 -07:00
Dave Airlie	53afbc980a	tgsi: drop unused set_exec/kill_mask interfaces. These don't get used and haven't been in git history from what I can see, so drop them. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 13:07:05 +10:00
Dave Airlie	1e8435ce0c	docs/relnotes: update ARB_internalformat_query2 status. Signed-off-by: Dave Airlie <Airlied@redhat.com>	2016-03-22 09:54:08 +10:00
Dave Airlie	ee7c8b9804	st/mesa: add support for internalformat query2. Add code to handle GL_INTERNALFORMAT_PREFERRED. Add code to deal with GL_RENDERBUFFER being passed into ChooseTextureFormat. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 09:49:08 +10:00
Jason Ekstrand	869e393eb3	anv/batch_chain: Fall back to growing batches when chaining isn't available	2016-03-21 15:29:30 -07:00
Anuj Phogat	4ba47f7b2a	i965: Fix assert conditions for src/dst x/y offsets Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-21 14:55:18 -07:00
Anuj Phogat	65cd2f8443	swrast: Move assert for 'slice' into check_map_teximage Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-03-21 14:55:18 -07:00
xavier	fce0b55ccb	r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications. Previously it was doing this transformation for a Trine 3 shader: MUL R6.x.12, R13.x.23, 0.5|3f000000 - MULADD R4.x.12, -R6.x.12, 2|40000000, 1|3f800000 + MULADD R4.x.12, -R13.x.23, -1|bf800000, 1|3f800000 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412 Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 07:43:13 +10:00
Samuel Pitoiset	9efd8b590f	nvc0: make sure to delete samplers used by compute shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-21 22:04:18 +01:00
Kenneth Graunke	4b0a5b21ae	i965/blorp: Make BlitFramebuffer() do sRGB encoding in ES 3.x. According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer is supposed to perform sRGB decoding and encoding whenever sRGB formats are in use. The ES 3.0 specification is completely clear, and has always stated this. However, the GL specification has changed behavior in 4.1, 4.2, and 4.4. The original behavior stated that no sRGB encoding should occur. The 4.4 behavior matches ES 3.0's wording. However, implementing the new behavior appears to break applications such as Left 4 Dead 2. This patch changes Meta to apply the ES 3.x rules in ES 3.x, but leaves OpenGL alone for now, to avoid breaking applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-21 13:55:32 -07:00
Kenneth Graunke	8679bb7c9e	i965/blorp: Refactor sRGB encoding/decoding. Because the rules for sRGB are so insane, we change brw_blorp_miptrees to take decode_srgb and encode_srgb flags, which control linearization of the source and destination separately. This should make it easy to implement whatever crazy combination of rules people throw at us. For now, it should be equivalent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-21 13:54:29 -07:00
Kenneth Graunke	eee8a53906	meta: Make BlitFramebuffer() do sRGB encoding in ES 3.x. According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer is supposed to perform sRGB decoding and encoding whenever sRGB formats are in use. The ES 3.0 specification is completely clear, and has always stated this. However, the GL specification has changed behavior in 4.1, 4.2, and 4.4. The original behavior stated that no sRGB encoding should occur. The 4.4 behavior matches ES 3.0's wording. However, implementing the new behavior appears to break applications such as Left 4 Dead 2. This patch changes Meta to apply the ES 3.x rules in ES 3.x, but leaves OpenGL alone for now, to avoid breaking applications. Meta implements several other functions in terms of BlitFramebuffer, and many of those explicitly do not perform sRGB encoding. So, this patch explicitly disables sRGB encoding in those other functions, preserving the existing (correct) behavior. If you're from the future and are reading this, hi! Welcome to the "fun" of debugging sRGB problems! Best of luck! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-21 13:53:44 -07:00
Nicolai Hähnle	b74784638d	docs: mark GL_ARB_shader_image_load_store/_size as done for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:26 -05:00
Edward O'Callaghan	5219eb15e1	radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES This enables ARB_shader_image_load_store and ARB_shader_image_size. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> [allow the same number of images for all shader stages and require LLVM 3.9] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:26 -05:00
Nicolai Hähnle	6f942ac5ee	radeonsi: disable early Z if the fragment shader writes to memory Empirically, both the EXEC_ON_* flags and LATE_Z are necessary. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	79762e877c	tgsi/scan: add writes_memory to flag presence of stores or atomics Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	e9d935ed0e	radeonsi: force the DCC enable bit off in image descriptors for writing (v2) This avoids a lockup at least on Tonga. v2: only force DCC off on VI+ (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	43f5ce1d20	radeonsi: implement MemoryBarrier (v2) v2: invalidate both constant and VMEM/TC L1 for constant buffers (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	97352aa50a	radeonsi: implement volatile memory access Prevent loads from being re-ordered or coalesced. Atomics don't need special handling by definition, and stores don't need special handling because LLVM is unable to detect dead image or buffer stores. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	5a61b428f4	radeonsi: implement coherent memory access (v2) v2: set glc=1 for volatile also on buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	d6fa650454	radeonsi: Lower TGSI_OPCODE_MEMBAR down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	f7a85a8a0a	radeonsi: Lower TGSI_OPCODE_ATOM* down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	bfcefcb3c7	radeonsi: Lower TGSI_OPCODE_STORE down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	1e82dedeca	radeonsi: Lower TGSI_OPCODE_LOAD down to LLVM op (v3) v2: new signature style for buffer intrinsics (offsets) v3: new signature style for llvm.amdgcn.buffer.load.format (overloaded return) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	136686a51d	radeonsi: extract the LLVM type name construction into its own function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	02bd0cd7b1	radeonsi: Lower TGSI_OPCODE_RESQ down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	75539197c7	radeonsi: extract TXQ buffer size computation into its own function This will allow it to be reused for RESQ. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	515fb2c09c	radeonsi: decompress shader images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	f61566b77a	radeonsi: update shader image descriptor for invalidated buffer Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	e85cf35a65	radeonsi: implement set_shader_images (v2) Whether DCC is disabled depends on the access flags with which the image is bound: image_load supports DCC, but store and atomic don't. v2: remove an unnecessary masking of images->desc.enabled_mask Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	b1b7268f01	gallium/radeon: make r600_texture_disable_dcc externally accessible We will need it in radeonsi for shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	457f9c6b25	tgsi/scan: track which shader images are really buffers Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	fa096a14af	tgsi/scan: add images_writemask Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	1379544081	st/mesa: translate additional flags in MemoryBarrier Re-order flags in the order in which they appear in the OpenGL spec in the description of MemoryBarrier(). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	96cd908fd3	gallium: add additional PIPE_BARRIER_* bits Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Brian Paul	86caa67aef	svga: add svga_winsys_context::pipe_debug_callback pointer The svga winsys modules can use this to send debug messages to the state tracker and Mesa. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	f8aaf0094d	svga: Fix the index buffer rebind regression The index buffer handle saved in the hw_state structure could be invalid after the index buffer is destroyed. Instead of rebinding the index buffer using the saved index buffer handle, we will reset the index buffer handle in the hw_state structure to force resending of the index buffer. Fixes bug 1593320 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	47856e5945	svga: rebind stream output targets To ensure stream output target surfaces are available for the draw commands, we need to rebind the current stream output targets at the first draw in the command buffer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	47cfc83440	svga: rebind index buffer Similar to other resources, current index buffer needs to be rebound at the first draw of the current command buffer to make sure the buffer is available for the draw command. Fixes bug 1587263. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Brian Paul	299f8ca0a7	svga: minor formatting fix, comment addition To sync with our internal tree. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-21 13:37:25 -06:00
Charmaine Lee	b45b47c5c9	svga: optimize constant buffer uploads When a constant buffer slot is allocated in the upload buffer, the allocated slot size is always a multiple of 256. But the actual buffer size might not be a multiple of 256. This causes a gap between the ending offset of a slot and the starting offset of the next slot. The gap will prevent the two slots from being updated in a single update command. In order to maximize the chance of merging the contiguous dirty ranges, when a slot is to be allocated in the constant upload buffer, specify a buffer size that is a multiple of 256. There is about 10% performance improvement with Lightsmark2008 and 30% with Cinebench R11. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Charmaine Lee	0a1d91ef97	svga: add a few more resource updates HUD query This patch adds the following HUD queries: .num-resource-updates -- number of resource update. Commands include UPDATE_SUBRESOURCE, UPDATE_GB_IMAGE. .num-buffer-uploads -- number of buffer uploads. .num-const-buf-updates -- number of set constant buffer. .num-const-updates -- number of set shader constant. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Charmaine Lee	79e343b36a	svga: add new num-readbacks HUD query To find out how many image readback commands are issued. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Brian Paul	dc9ecf58c0	svga: use shader sampler view declarations Previously, we looked at the bound textures (via the pipe_sampler_views) to determine texture dimensions (1D/2D/3D/etc) and datatype (float vs. int). But this could fail in out of memory conditions. If we failed to allocate a texture and didn't create a pipe_sampler_view, we'd default to using 0 (PIPE_BUFFER) as the texture type. This led to device errors because of inconsistent shader code. This change relies on all TGSI shaders having an SVIEW declaration for each SAMP declaration. The previous patch series does that. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	b56b853ab3	gallium/tests: declare sampler views in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	38e831ca3d	gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	e7b5a844e3	postprocess: declare sampler views in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	5a9f2a2d89	hud: add sampler view declaration in text fragment shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	b3daaefadb	st/mesa: emit sampler view decls in drawpixels code v2: support both TGSI_TEXTURE_2D and _RECT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	0f0a23d4d8	st/mesa: emit sampler view declaration in bitmap shader In June 2015, Rob Clark started updating the tgsi utility code to emit SVIEW declarations in various shaders (for polygon stipple, blitting, etc). These patches do the same for the Mesa state tracker. The VMware driver will use this. v2: support both TGSI_TEXTURE_2D and _RECT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	72eb5a3cfe	st/mesa: emit sampler view declarations for ARB vert/frag programs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	eda81fa357	st/mesa: use correct TGSI texture target in drawpix fragment shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	83b5b3d66e	st/mesa: use correct TGSI texture target in bitmap fragment shader Depending on the driver's support for NPOT textures, we might use a RECT texture instead of 2D texture. We should propagate that info to the fragment shader's TEX instruction. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	63e020d734	gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst() Instead of hard-coded 2D tex target in tgsi_transform_tex_2d_inst() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Nicolai Hähnle	a8b315b827	st/mesa: use the texture view's format for render-to-texture Aside from the bug below, it fixes a simplistic test I've written locally, and I see no regression in Piglit for radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94595 Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 11:28:38 -05:00
Hans de Goede	dcf8a4d281	gallium: Remove unused TGSI_RESOURCE_ defines These magic file-index defines were only ever used in the nouveau code and that no longer uses them. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2016-03-21 12:20:58 +01:00
Hans de Goede	9b4c8f6629	nouveau: codegen: Do not silently fail in handleLOAD / handleSTORE / handleATOM handleLOAD / handleSTORE / handleATOM can only handle TGSI_FILE_BUFFER and TGSI_FILE_MEMORY. Make things fail explicitly when another register-file is used in these functions. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:48 +01:00
Hans de Goede	86e4440361	nouveau: codegen: Disable more old resource handling code Commit `c3083c7082` ("nv50/ir: add support for BUFFER accesses") disabled / commented out some of the old resource handling code, but not all of it. Effectively all of it is dead already, if we ever enter the old code paths in handleLOAD / handleSTORE / handleATOM we will get an exception due to trying to access the now always zero-sized resources vector. Disable all the dead code. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:40 +01:00
Hans de Goede	71e315475c	nouveau: codegen: gk110: Make emitSTORE offset handling identical to emitLOAD Make the store offset handling in CodeEmitterGK110::emitSTORE identical to the one in CodeEmitterGK110::emitLOAD handling. This is just a cleanup, it does not cause any functional changes. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-21 12:20:38 +01:00
Hans de Goede	c783ad0e24	nouveau: codegen: Slightly refactor Source::scanInstruction() dst handling Use the dst temp variable which was used in the TGSI_FILE_OUTPUT case everywhere. This makes the code somewhat easier to read and helps avoid going over 80 chars with upcoming changes. This also brings the dst handling more in line with the src handling. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-21 12:20:32 +01:00
Hans de Goede	54cdde5eff	nouveau: codegen: Add support for clover / OpenCL kernel input parameters Add support for clover / OpenCL kernel input parameters. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:28 +01:00
Hans de Goede	3788e1bf74	tgsi: Add support for global / private / input MEMORY Extend the MEMORY file support to differentiate between global, private and shared memory, as well as "input" memory. "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a special memory type is added for this, since the actual storage of these (e.g. UBO-s) may differ per implementation. The uploading of kernel parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers to use an access mechanism for parameter reads which matches with the upload method. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:24 +01:00
Hans de Goede	43ddec2f43	tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text When support for decl.Atomic and .Shared was added, tgsi_build_declaration was not updated to propagate these properly. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:19 +01:00
Iago Toral Quiroga	8f45691cda	doc: document spilling options accepted by INTEL_DEBUG Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-21 08:16:49 +01:00
Hans de Goede	b72156c8e0	tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memory tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: `3243b6fc97` ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 18:01:53 -04:00
Ilia Mirkin	bbbdcdcf75	st/mesa: report correct precision information for low/medium/high ints When we have native integers, these have full precision. Whether they're low/medium/high isn't piped through the TGSI yet, but eventually those might have differing precisions. For now they're just 32-bit ints. Fixes the following dEQP tests: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int dEQP-GLES3.functional.state_query.shader.precision_fragment_highp_int which expected highp ints to have full 32-bit precision, not the default 23-bit float precision. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-20 17:51:08 -04:00
Nishanth Peethambaran	eeb117a09d	st/omx/dec: Correct the timestamping Attach the timestamp to the dpb buffer and use that timestamp while pushing buffer from dpb list to the omx client. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 15:01:28 -04:00
Nishanth Peethambaran	46de6bbb77	st/omx: Remove trailing spaces Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 15:01:28 -04:00
Ilia Mirkin	7d98bfedd7	nv50/ir: fix indirect texturing for non-array textures on nvc0 If a layer parameter is provided, we want to flip it to position 0 (and combine it with any indirect params). However if the target is not an array, there is no layer, so we have to shift all of the arguments down by one to make room for it. This fixes situations where there were non-coordinate parameters, such as bias, lod, depth compare, explicit derivatives. Instead of adding a new parameter at the front for the indirect reference, we would swap one of those in its place. Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.uniform.compute.*shadow Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Ilia Mirkin	adb40a7399	st/mesa: only minify depth for 3d targets We make sure that the image depth matches the level's depth before copying it into place. However we should only be minifying the first level's depth for 3d textures - array textures have the same depth for all levels. This fixes tests such as dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.* and I suspect accounts for a number of other odd situations I've run into where level > 0 of array textures was messed up. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Ilia Mirkin	6eeb284e4f	nv50/ir: normalize cube coordinates after derivatives have been computed In "manual" derivative mode (always used on nv50 and sometimes on nvc0 but always for cube), the idea is that using the quadop instruction, we set up the "other" quads to have values such that the derivatives work out, and then run the texture instruction as if nothing were strange. It pulls values from the other lanes, and does its magic. However cube coordinates have to be normalized - one of the 3 coords has to be 1, to determine which is the major axis, to say which face is being sampled. We were normalizing the coordinates first, and then adding the derivatives. This is wrong for two reasons: - the coordinates got normalized by a scaling factor but the derivatives didn't - the result of the addition didn't end up normalized To resolve this, we flip the logic around to normalize after the per-lane coordinates are set up. This fixes a bunch of textureGrad cube dEQP tests. NOTE: nv50 cube arrays with explicit derivatives are still broken, to be resolved at a later date. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Marek Olšák	ea2bff1d11	gallium/radeon: remove remnants of R600 TGSI->LLVM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:57:05 +01:00
Marek Olšák	4e5dc69af1	r600g: flatten if (1) statement after removal of TGSI->LLVM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:57:05 +01:00
Marek Olšák	20a09897a6	r600g: remove TGSI->LLVM translation It was useful for testing and as a prototype for radeonsi bringup, but it's not used anymore and doesn't support OpenGL 3.3 even. v2: try to fix OpenCL build Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-03-20 00:57:02 +01:00
Marek Olšák	8140154ae9	gallium/radeon: remove old CS tracing Cons: - it was only integrated in r600g - it doesn't work with GPUVM - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact - it lacks an IB parser and user-friendliness A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:56:35 +01:00
Marek Olšák	a73a657def	radeonsi: process TGSI property NEXT_SHADER This allows compiling the main shader part as ES or LS. If we get the correct hint, non-separable GLSL shaders no longer have to be compiled as VS first, followed by LS or ES compiled on demand. The result is that fewer shaders are compiled by piglit, but it doesn't improve piglit running time. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Marek Olšák	2bdd7a46a9	st/mesa: set TGSI property NEXT_SHADER Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Marek Olšák	fbe6e92899	gallium: add TGSI property NEXT_SHADER Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first. This is only set for VS and TES, because we don't need it elsewhere. VS has 3 variants: - next shader is FS - next shader is GS - next shader is TCS TES has 2 variants: - next shader is FS - next shader is GS Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader". By default, ureg always sets "next shader is FS". Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Pierre Moreau	9184d9a0bb	nvc0/ir: Use double constant in handleSQRT Fixes: `a100d89d09` (nv50,nvc0: Fix invalid constant.) Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 15:59:52 -04:00
Kenneth Graunke	789e096594	mesa: Disallow GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME on winsys FBO. Fixes: dEQP-GLES3.functional.negative_api.state.get_framebuffer_attachment_parameteriv Apparently, GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is not allowed when GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is GL_FRAMEBUFFER_DEFAULT, and is expected to result in a GL_INVALID_ENUM error. No GL specification actually defines what GL_FRAMEBUFFER_DEFAULT means. It probably means the window system FBO. It also doesn't mention the behavior of any queries for that type. Various ARB folks seem fairly confused about it too. For now, just do something vaguely like what dEQP expects. I think we probably need to check the visual bits against 0 for the attachment, but we haven't been doing that thus far, and given how confusingly this is specified, I can't imagine anyone relying on it. v2: Improve comments, move error condition above the _mesa_get_fb0_attachment call, add forgotten "return" (all suggested/caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-19 12:58:15 -07:00
Ilia Mirkin	d2445b0083	nv50/ir: force-enable derivatives on TXD ops This matters especially in vertex shaders, where derivatives are disabled by default. This fixes textureGrad in vertex shaders on nv50. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-19 13:09:49 -04:00
Ilia Mirkin	d1b85dbffa	nv50: reset TFB bufctx when we no longer hold a reference to the buffers This fix is analogous to commit `ff085d014`. This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-19 13:09:49 -04:00
Samuel Pitoiset	902bbda81b	nvc0: avoid using magic numbers for the uniform_bo offsets Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This seems to be related to textures. Anyway, naming this NVC0_CB_AUX_UNK_INFO and adding a todo should be enough for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 18:01:08 +01:00
Samuel Pitoiset	26cc411db8	nv50/ir: make use of auxCBSlot instead of magic numbers This avoids using magic numbers for the driver constbuf slot which is always 15 except for compute shaders on gk104+ where the slot 0 is used. For gk104+, some special compute-related values like the thread index are uploaded to screen->parm which is currently bound on c0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 18:01:04 +01:00
Samuel Pitoiset	d86933e6f4	nv50,nvc0: replace resInfoCBSlot by auxCBSlot Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 18:00:59 +01:00
Samuel Pitoiset	e05492fd7f	nv50/ir: fix compilation warning in handleSharedATOM() In release build mode only, op may be used uninitialized because the assertion has been removed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 17:01:17 +01:00
Vinson Lee	a100d89d09	nv50,nvc0: Fix invalid constant. Fix clang build error. CXX codegen/nv50_ir_lowering_nvc0.lo codegen/nv50_ir_lowering_nvc0.cpp:1783:42: error: invalid suffix 'd' on floating constant Value *zero = bld.loadImm(NULL, 0.0d); ^ Fixes: `c1e4a6bfbf` ("nv50,nvc0: handle SQRT lowering inside the driver") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-18 20:38:41 -07:00
Kenneth Graunke	46610238e0	mesa: Do proper format error checks for GenerateMipmap in ES 3.x. According to the OpenGL ES 3.2 spec's description of GenerateMipmap: "An INVALID_OPERATION error is generated if the levelbase array was not specified with an unsized internal format from table 8.3 or a sized internal format that is both color-renderable and texture-filterable according to table 8.10." Similar text exists in the ES 3.0 specification as well. Our existing rules are pretty close, but miss a few things. The OpenGL specification actually doesn't have any text about internal format checking - our existing code comes from a Khronos bug report. The ES 3.x spec provides a clearer description. Fixes dEQP-GLES3.functional.negative_api.texture.generatemipmap and dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level_array_compressed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:43:47 -07:00
Kenneth Graunke	f1b0573510	mesa: Add color renderable/texture filterable format info for ES 3.x. OpenGL ES 3.x contains a table of sized internal formats and their required properties. In particular, each format is marked as "Color Renderable" or "Texture Filterable". This patch introduces two functions that can be used to query the information from that table. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:43:23 -07:00
Kenneth Graunke	88d28aa4d9	i965: Stop XY clipping point and line primitives. Wide points and lines are not supposed to be clipped by the viewport. Rather, they should be rendered, and any fragments outside of the viewport should be discarded. The traditional use case for this behavior is rendering moving wide point particles. When the center of the point approaches the viewport edge, clipping would make it pop out of view early. Fixes: - dEQP-GLES2.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:51 -07:00
Kenneth Graunke	0de64ab788	i965: Scissor to the viewport when rendering points/lines. We're about to start allowing wide points/lines whose vertices are outside the viewport past the clipper. This scissoring hack ensures that any fragments generated are still restricted to the viewport. It is not necessary on Gen8+ as those platforms already discard fragments which are outside the viewport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:30 -07:00
Kenneth Graunke	d000a4989f	i965: Include the viewport in the scissor rectangle. We'll need to use scissoring to restrict fragments to the viewport soon. It seems harmless to include it generally, so let's do that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:15 -07:00
Kenneth Graunke	47be5a64c7	i965: Introduce an is_drawing_lines() helper. Similar to is_drawing_points(). v2: Account for isoline tessellation output topology. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:41:59 -07:00
Kenneth Graunke	757674e8d0	i965: Move is_drawing_points to brw_state.h. I need to use this in multiple source files. v2: Rebase on TES output domain fix. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:41:25 -07:00
Jason Ekstrand	ecfb074276	anv/allocator: Make the bo_pool dynamically sized	2016-03-18 17:25:58 -07:00
Kenneth Graunke	5b2d8c2273	i965: Fix gl_TessLevelOuter[] for isolines. Thanks to James Legg for finding this! From the ARB_tessellation_shader spec: "The number of isolines generated is derived from the first outer tessellation level; the number of segments in each isoline is derived from the second outer tessellation level." According to the PRM, "TF.LineDensity determines # lines" while "TF.LineDetail determines # segments". Line Density is stored at DWord 6, while Line Detail is at DWord 7. So, they're not reversed like they are for triangles and quads. Fixes Piglit's spec/arb_tessellation_shader/execution/isoline, and about 24 dEQP isoline tests (with GL_EXT_tessellation_shader hacked on - it's not normally enabled). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94524 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 16:45:23 -07:00
Kenneth Graunke	24298b7e2f	i965: Decode non-normalized coordinates bit in SAMPLER_STATE. We weren't printing this for some reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-18 16:44:51 -07:00
Kenneth Graunke	8679d40dc7	i965: Account for TES in is_drawing_points(). Now that we implement tessellation shaders, the TES might be the last stage enabled. If it's outputting points, then the primitive type reaching the SF is points. We need to account for this. Caught by Ilia Mirkin. v2: Update dirty bit comment above caller (caught by Iago) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-18 16:44:15 -07:00
Pierre Moreau	1282146d4e	nv50: Mark compute states as dirty on context switch Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> [ Samuel Pitoiset: Trivial rebase conflict ] Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-19 00:18:00 +01:00
Samuel Pitoiset	a734c0f8ba	nv50/ir: print SUBFM subops Only 3d subop is currently emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 00:09:18 +01:00
Samuel Pitoiset	af0c97fb90	nv50: add a new validation path for compute This makes use of the new state validation interface to be consistent with 3d. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:14 +01:00
Samuel Pitoiset	5ed387675d	nv50: rework nv50_compute_validate_program() Reduce the amount of duplicated code by re-using nv50_program_validate(). While we are at it, change the prototype to return void. We don't check anymore if the translation fails but improving the state validation is a long process. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:09 +01:00
Samuel Pitoiset	a07ebc1993	nv50: rework the validation path for 3D This exposes an interface for state validation that will be also used to rework the compute validation path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:05 +01:00
Samuel Pitoiset	517d2c97e1	nv50: rename 3d binding points to NV50_BIND_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:02 +01:00
Samuel Pitoiset	9374fc1e67	nv50: rename 3d dirty flags to NV50_NEW_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:56 +01:00
Samuel Pitoiset	e844aac40b	nv50: rename NV50_COMPUTE to NV50_CP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:52 +01:00
Samuel Pitoiset	dedb46f582	nv50: rename nv50_context::dirty to nv50_context::dirty_3d Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:28 +01:00
Jason Ekstrand	b1c5d45872	anv/allocator: Add a size field to bo_pool_alloc	2016-03-18 11:50:53 -07:00
Brian Paul	9211b68ad3	st/mesa: clean up st_translate_texture_target() Reformat code. Improve assertion. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:31 -06:00
Brian Paul	0f73c3ab25	st/mesa: simplify drawpixels shader code with tgsi transform helper functions Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Brian Paul	373910f4e7	st/mesa: simplify bitmap shader code with tgsi transform helper functions Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Brian Paul	e9d5e68d1b	tgsi: add tgsi_transform_op3_inst() function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Juan A. Suarez Romero	7a712e64d6	doc: add 'vec4' option in INTEL_DEBUG Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-18 17:30:56 +01:00
Daniel Czarnowski	d4714512e4	egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...) Patch provides a default for a set pbuffer surface size when EGL_LARGEST_PBUFFER is used by the client. MIN2 macro is moved to egldefines so that it can be shared. Fixes the following Piglit test: egl-create-largest-pbuffer-surface From EGL 1.5 spec: "Use EGL_LARGEST_PBUFFER to get the largest available pbuffer when the allocation of the pbuffer would otherwise fail." Currently there exists no API to query the largest available pixmap size using xlib or xcb, so right now this seems the most straightforward way to ensure that we fulfill the above API and also don't attempt to allocate a 'too big' pixmap, which might succeed on the server side but not work in practice when the driver starts to use it as a texture. v2: add more explanation about the change (Emil) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-18 07:35:32 +02:00
George Kyriazis	dd63fa28f1	gallium/swr: Cleaned up some context-resource management Removed bound_to_context. We now pick up the context from the screen instead of the resource itself. The resource could be out-of-date and point to a pipe that is already freed. Fixes manywin mesa xdemo. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-03-17 20:57:52 -05:00
Timothy Arceri	952c166170	mesa: remove remaining tabs in prog_parameter.c Acked-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:53 +11:00
Timothy Arceri	ce9c042ab3	mesa: inline _mesa_add_unnamed_constant() Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:43 +11:00
Timothy Arceri	fa9bd6b663	mesa: simplify and inline _mesa_lookup_parameter_index() The function has only one user and strings are always null terminated. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:39 +11:00
Timothy Arceri	350b1ef027	mesa: make _mesa_lookup_parameter_constant static This is not used outside of prog_parameter.c Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:34 +11:00
Timothy Arceri	7794b22a84	mesa: remove unused function Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:30 +11:00
Nicolai Hähnle	a8eea696b8	st/mesa: honour sized internal formats in st_choose_format (v2) The bitcasting which is possible with shader images (and texture views?) requires that when the user specifies a sized internal format for a texture, we really allocate that format. To this end: (1) find_exact_format should ignore sized internal formats and (2) some of the entries in the mapping table corresponding to sized internal formats are reordered to use an RGBA format instead of a BGRA one. This fixes arb_shader_image_load_store-bitcast in the (work in progress) ARB_shader_image_load_store implementation for radeonsi. v2: don't change the mapping of GL_RGB10: the change caused a regression because it preferred a format with an alpha channel, and GL_RGB10 is not among the supported formats for shader images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 19:26:40 -05:00
Dongwon Kim	49eb5e75bd	configure.ac: enable_asm=yes when x-compiling across same X86 arch Currently, the configure script forces 'enable_asm' to be 'no' whenever cross-compilation is performed on an X86 host. This is based on an assumption that the target architecture is different from the host's (e.g. ARM). But there's always a case where we cross-compile for a target that is also X86 based, just like the host, in which case the same ASM code will be supported. 'enable_asm' should not be forced to be "no" anymore in this case. v2: corrected commit message Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>	2016-03-17 16:53:23 -07:00
Timothy Arceri	d6b9202873	glsl: disable varying packing when it's not safe In GL 4.4+ there is no guarantee that interpolation qualifiers will match between stages so we cannot safely pack varyings using the current packing pass in Mesa. We also disable packing on outward facing interfaces for SSO because in ES we need to retain the unpacked varying information for draw time validation. For desktop GL we could allow packing for SSO in versions < 4.4 but it's just safer not to do so. We do however enable packing on individual arrays, structs, and matrices as these are required by the transform feedback code and it is still safe to do so. Finally we also enable packing when a varying is only used for transform feedback and it's not an SSO. This fixes all remaining rendering issues with the dEQP SSO tests; the only issues remaining with those tests are to do with validation. Note: There is still one remaining SSO bug that this patch doesn't fix. There is a chance that VS -> TCS will have mismatching interfaces because we pack VS output in case it's used by transform feedback but don't pack TCS input for performance reasons. This patch will make the situation better but doesn't fix it. V4: fix out of order function params after rebase, make sure packing is still disabled in tess stages. Update comments as to why we disable packing on SSO. V3: ES 3.1 does require interpolation to match so don't disable packing there. Rebased on master rather than on enhanced layouts component packing series. V2: Make is_varying_packing_safe() a function in the varying_matches class, fix spelling (Matt) and make sure to remove the outer array when dealing with Geom and Tess shaders where appropriate. Lastly fix piglit regression in new piglit test and document the undefined behaviour it depends on: arb_separate_shader_objects/execution/vs-gs-linking.shader_test Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-18 10:26:34 +11:00
Timothy Arceri	c0ae6eeb3b	glsl: pass disable_varying_packing bool to the lowering pass This will allow us to choose to ignore the disable which will be useful for more fine grained control over when to enable or disable packing. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-18 10:26:30 +11:00
Marek Olšák	4ab2ac3349	radeonsi: fix Hyper-Z hangs on P2 configs Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 18:30:45 +01:00
Romain Failliot	151724159d	docs: Renormalize older extensions. For older extensions, there is an explanation first and the extension name in brackets, like that: Clamping controls (GL_ARB_color_buffer_float) I inverted that so we have the extension first and then the explanation in brackets, like that: GL_ARB_color_buffer_float (Clamping controls) It will help me later to parse the few extensions that use this syntax: all drivers that support <GL_extension> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:20 -05:00
Romain Failliot	f5d47dd428	docs: Renormalize some extensions. This fixes some exceptions I have to deal with in mesamatrix.net. The extensions GL_ARB_texture_buffer_object had a comment between "DONE" and the brackets. And the extension GL_KHR_robustness (in GL 4.5 and GLES 3.1) was using "90% done" instead of "in progress". The "90% done" is still here though, but as an extension comment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:12 -05:00
Romain Failliot	3671bb3eaf	docs: Realign the "Status" column. The "Status" column was misaligned in some GL sections. This is a lot of diffs, but it's only spaces in the end. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:09 -05:00
Romain Failliot	e571f11de8	docs: howto to read and edit GL3.txt Added a small guide on how to read and edit GL3.txt. I think this would help the devs as much as the users reading this file. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:34:50 -05:00
Brian Paul	84b961dd53	r300g: add missing layer argument to rws->buffer_get_handle() call Fixes compilation error since `5aea0d691`. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-17 09:52:21 -06:00
Christian König	5aea0d6919	radeon/winsys: add layer support for BO export Add layer support to export individual array layers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:17:06 +01:00
Christian König	04bc082f6a	radeon/winsys: add offset support for BO import/export Add offset support to handle NV12 offsets as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:17:03 +01:00
Christian König	f1e78a48f2	gallium/winsys/drm: add layer to struct winsys_handle For exporting a specific layer of an array texture. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:16:59 +01:00
Christian König	29d26f1522	gallium/winsys/drm: add offset to struct winsys_handle We are going to need this for EGL_EXT_image_dma_buf_import. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:16:03 +01:00
Connor Abbott	58fe7837b8	nir: propagate bitsize information in nir_search When we replace an expression we have to compute bitsize information for the replacement. We do this in two passes to validate that bitsize information is consistent and correct: first we propagate bitsize from child nodes to parent, then we do it the other way around, starting from the original instruction's destination bitsize. v2 (Iago): - Always use nir_type_bool32 instead of nir_type_bool when generating algebraic optimizations. Before we used nir_type_bool32 with constants and nir_type_bool with variables. - Fix bool comparisons in nir_search.c to account for bitsized types. v3 (Sam): - Unpack the double constant value as unsigned long long (8 bytes) in nir_algebraic.py. v4 (Sam): - Use helpers to get type size and base type from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Connor Abbott	3124ce699b	nir: add a bit_size parameter to nir_ssa_dest_init v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Iago Toral Quiroga	084b24f558	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	9076c4e289	nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fall back to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	6700d7e423	nir: add nir_{src,dest}_bit_size() helpers v2: use a ternary (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	e172dbe5d2	nir: Add a bit_size to nir_register and nir_ssa_def This really hacky commit adds a bit size to registers and SSA values. It also adds rules in the validator to validate that they do the right things. It's still an open question as to whether or not we want a bit_size in nir_alu_instr or if we just want to let it inherit from the destination. I'm inclined to just let it inherit from the destination. A similar question needs to be asked about intrinsics. v2 (Connor): - Relax validation: comparisons have explicit destination sizes and implicit source sizes. v3 (Sam): - Use helpers to get size and base types of nir_alu_type enum. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	3d37de930d	nir/types: add a function to get the bitsize of a base type v2: fix it for GLSL_TYPE_SUBROUTINE (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Samuel Iglesias Gonsálvez	c38a25af2f	i965/nir: fix check to resolve booleans to work with sized nir_alu_type As nir_alu_type has now embedded the data size, the check for the instruction's output type (to see if a boolean resolve is required) should ignore the data size part. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	78f1919429	nir: Add explicitly sized types v2: Fix size/type mask to properly handle 8-bit types. v3: Add helpers to get the bitsize and base type of a nir_alu_type enum. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jordan Justen	3fd308a357	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-17 01:44:07 -07:00
Jordan Justen	7d021cb15e	i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	b1e7cdfdcf	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	e3cbb9d37c	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	683c359c54	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	3c807607df	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	26f8262698	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Iago Toral Quiroga	5be11d2236	i965: Skip execution size adjustment for instructions of width 4 This code in brw_set_dest adjusts the execution size of any instruction with a dst.width < 8. However, we don't want to do this with instructions operating on doubles, since these will have a width of 4, but still need an execution size of 8 (for SIMD8). Unfortunately, we can't just check the size of the operands involved to detect if we are doing an operation on doubles, because we can have instructions that do operations on double operands interpreted as UD, operating on any of its 2 32-bit components. Previous commits have made it so we never emit instructions with a horizontal width of 4 that don't have the correct execution size set for gen6+, so we can skip it in this case, avoiding the conflicts with fp64 requirements. Expanding the same fix to other hardware generations requires many more changes but since we are not targeting fp64 support on them we don't really care for now. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	22a10dd030	i965/vec4/gen6: fix exec_size for MOV with a width of 4 in generate_gs_ff_sync() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	b91b9e4b00	i965/vec4/gen6: fix exec_size for instructions with destination width of 4 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	30fc3fa24d	i965/vec4/gen6: fix exec_size for instructions with width of 4 in generate_gs_svb_write() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	2fafc6b98c	i965/gs/gen6: fix execsize for instructions with width of 4 in gen6_sol_program() v2: - Add assert (Topi). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	f6342b5645	i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	31a8604252	i965/eu: set execution size for SEND message in brw_send_indirect_message Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	2d6af62a0f	i965/fs: Set exec size for gen7 pull const loads v2 (Topi): - No need to set the execsize for the indirect send message, the next patch will handle that. - Set the execution size explicitly instead of taking it from the width of the dst that we set before. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:24 +01:00
Iago Toral Quiroga	ea45b6e96d	i965/eu: set correct execution size in brw_NOP v2: NOP should have an execsize of 1 (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:24 +01:00
Kenneth Graunke	9c1e01c4a8	meta: Don't use integer handles for shaders or programs. Previously, we gave our internal clear/blit shaders actual GL handles and stored them in the shader/program hash table. We used ordinary GL API entrypoints to work with them. We thought this shouldn't be a problem because GL doesn't allow applications to invent their own names for shaders or programs. GL allocates all names via glCreateShader and glCreateProgram. However, having them in the hash table is a bit risky: if a broken application guesses the name of our shaders or programs, it could alter them, potentially screwing up future meta operations. Also, test cases can observe the programs in the hash table. Running a single dEQP process that executes the following test list: dEQP-GLES3.functional.negative_api.buffer.clear dEQP-GLES3.functional.negative_api.shader.compile_shader dEQP-GLES3.functional.negative_api.shader.delete_shader would result in the last two tests breaking. The compile_shader test calls glCompileShader(9) straight away, and since it hasn't even created any shaders or programs, it expects to get a GL_INVALID_VALUE error because there's no such name. However, because the clear test ran first, it created Meta programs, so an object named "9" did exist. This patch reworks Meta to work with gl_shader and gl_shader_program pointers directly. These internal programs have bogus names, and are never stored in the hash tables, so they're invisible to applications. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94485 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	0fe254168b	mesa: Expose compile_shader() and link_program() beyond the file. This will allow me to use them directly from Meta, bypassing the versions that work with GL integer handles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	7753657cf2	mesa: Make link_program() take a gl_shader_program, not a GLuint. In half the callers, we already have a pointer, and don't need to look it up again. This will also help with upcoming meta work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	a461e0003f	mesa: Make compile_shader() take a gl_shader, not a GLuint. In half the callers, we already have a pointer, and don't need to look it up again. This will also help with upcoming meta work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	a7e9b31d5b	meta: Use the _mesa_meta_compile_and_link_program helper more places. Less boilerplate. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-16 23:57:11 -07:00
Eric Anholt	2b9f0dffe0	vc4: Move discard handling to the condition flag. Now that the field exists in the instruction, we can make discards less special. As a bonus, that means that we should be able to merge some more .sf instructions together when we get around to that. This causes some scheduling changes, as it allows tlb_color_reads to be delayed past the discard condition setup. Since the tlb_color_read ends up later, this may mean performance improvements, but I haven't tested. total instructions in shared programs: 78114 -> 78035 (-0.10%) instructions in affected programs: 1922 -> 1843 (-4.11%) total estimated cycles in shared programs: 234318 -> 234329 (0.00%) estimated cycles in affected programs: 8200 -> 8211 (0.13%)	2016-03-16 11:28:47 -07:00
Eric Anholt	7c9fc43915	vc4: Don't make a temporary for setting flags. The register allocator doesn't really do anything about the temp, so it doesn't seem like it should matter. However, the scheduler would think that a new def is being created. This doesn't change anything yet, but it avoids a bunch of regressions in the next commit.	2016-03-16 11:28:34 -07:00
Eric Anholt	b4f45f319c	vc4: Add a safety check for setting flags. If a pack was on the src reg, should it be a float, int, or mul unpack? Just complain, instead.	2016-03-16 11:28:34 -07:00
Eric Anholt	a298fb15af	vc4: Reuse list_for_each_entry_safe_rev(). This didn't exist when I wrote the code.	2016-03-16 11:28:34 -07:00
Nanley Chery	5464f0c046	anv/blit: Reduce number of VUE headers being read Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:23 -07:00
Nanley Chery	f33866ae0a	anv/blit: Remove completed finishme for VkFilter This task was finished as of: `d9079648d0`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:19 -07:00
Nanley Chery	5647de8ba5	anv/blit2d: Only use one extent in meta_emit_blit2d Since scaling isn't involved, we don't need multiple extents. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:14 -07:00
Nanley Chery	92fb65f117	anv/blit2d: Remove sampler from pipeline Since we're using texelFetch with a sampled image, a sampler is no longer needed. This agrees with the Vulkan Spec section 13.2.4 Descriptor Set Updates: sampler is a sampler handle, and is used in descriptor updates for types VK_DESCRIPTOR_TYPE_SAMPLER and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER if the binding being updated does not use immutable samplers. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:00 -07:00
Nanley Chery	f8f9886915	anv/blit2d: Use texel fetch in frag shader The texelFetch operation requires that the sampled texture coordinates be unnormalized integers. This will simplify the copy shader for w-tiled images (stencil buffers). v2 (Jason): Use f2i for texel coords Fix num_components indirectly Use float inputs for interpolation Nest tex_pos functions Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:51 -07:00
Nanley Chery	b487acc489	Revert "anv/meta: Make meta_emit_blit() public" This reverts commit `f391683922`. Some conflicts had to be resolved in order for this revert to be successful. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:46 -07:00
Nanley Chery	1a0c63b880	Revert "anv/meta: Prefix anv_ to meta_emit_blit()" This reverts commit `514c055717`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:41 -07:00
Nanley Chery	997a873f0c	anv/blit2d: Customize meta blit structs and functions for blit2d API * Add fields in meta struct * Add support in meta init/teardown * Switch to custom meta_emit_blit2d() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:22 -07:00
Nanley Chery	2d8c632117	anv/blit2d: Copy anv_meta_blit.c functions These will be customized for blit2d operations. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:10 -07:00
Kenneth Graunke	b566317e7e	meta: Use ARB_explicit_attrib_location in the rest of the meta shaders. This is cleaner than using glBindAttribLocation(). Not all drivers support the extension, but I don't think those drivers use GLSL in the first place. Apparently some Meta shaders already use GL_ARB_explicit_attrib_location, so I think it should be okay. Honestly, I'm not sure how the old code worked anyway - we bound the attribute location for "texcoords", while all the shaders capitalized or spelled it differently. v2: Convert another instance in brw_meta_fast_clear.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-16 00:09:56 -07:00
Plamena Manolova	9d9965c06f	mesa: Ignore glPointSize when GL_POINT_SIZE_ARRAY_OES is enabled When a user defines a point size array and enables it, the point size value set via glPointSize should be ignored. To achieve this, we can simply toggle ctx->VertexProgram.PointSizeEnabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42187 Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-15 15:49:48 -07:00
Jason Ekstrand	abaa3bed22	anv/device: Flush the fence batch rather than the start of the BO	2016-03-15 15:24:24 -07:00
Jason Ekstrand	7f6a0cb29c	Merge remote-tracking branch 'public/master' into vulkan	2016-03-15 14:09:50 -07:00
Varad Gautam	e103b52aec	vc4: Coalesce instructions using VPM reads into the VPM read. This is done instead of copy propagating the VPM reads into the instructions using them, because VPM reads have to stay in order. shader-db results: total instructions in shared programs: 78509 -> 78114 (-0.50%) instructions in affected programs: 5203 -> 4808 (-7.59%) total estimated cycles in shared programs: 234670 -> 234318 (-0.15%) estimated cycles in affected programs: 5345 -> 4993 (-6.59%) Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Rhys Kidd <rhyskidd@gmail.com>	2016-03-15 13:09:24 -07:00
Varad Gautam	00bdbb22a9	vc4: rename file to group vpm optimizations together This file will contain optimization passes for both vpm reads and writes. Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-03-15 12:49:37 -07:00
Eric Anholt	1c4b077409	vc4: Fix failures with nir_extract_* since the addition of the opcodes.	2016-03-15 12:49:37 -07:00
Roland Scheidegger	bb2c5e657b	llvmpipe: fix lp_rast_plane alignment on 32bit Some rasterization code relies (for sse) on the first and third planes (but not the second for now) being 128bit aligned, and we didn't get that on 32bit - I mistakenly thought the 64bit number in the struct would get the thing aligned to 64bit even on 32bit archs. Stephane Marchesin really figured this out. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 19:42:15 +01:00
Roland Scheidegger	12a4f0bed6	draw: fix line stippling The logic was comparing actual ints, not true/false values. This meant that it was emitting always multiple line segments instead of just one even if the stipple test had the same result, which looks inefficient, and the segments also overlapped thus breaking line aa as well. (In practice, with the no-op default line stipple pattern, for a 10-pixel long line from 0-9 it was emitting 10 segments, with the individual segments ranging from 0-1, 0-2, 0-3 and so on.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 19:41:34 +01:00
Roland Scheidegger	4b249ed4cd	softpipe: fix misleading TGSI_QUAD_SIZE usage All these img filter loops iterate through NUM_CHANNELS, not QUAD_SIZE. In practice both are of course the same unchangeable value (4), but it makes the code look a bit confusing. Moreover, some of the functions were actually given an array of 4 values according to the declaration, yet the code was addressing values 0/4/8/12 out of it, so fix this by just saying it's a pointer to floats like the other functions. While here, also add comment about not quite correct filtering. There's no actual code difference. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-15 19:37:59 +01:00
Roland Scheidegger	9e9d69979c	softpipe: fix anisotropic filtering crash The filt_args->offset wasn't assigned but was always used later leading to a crash (as far as I can tell, texel offsets don't actually make much sense with anisotropic filtering, but because there's no explicit setting if offsets are enabled there the array is always accessed). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 16:40:05 +01:00
Nicolai Hähnle	4de25fa7b0	radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:59 -05:00
Nicolai Hähnle	0ffcc318e6	tgsi: add tgsi_full_src_register_from_dst helper function Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:49 -05:00
Nicolai Hähnle	c02d73af0b	gallium/u_inlines: add util_copy_image_view Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:46 -05:00
Nicolai Hähnle	f6dc4f5558	st/mesa: set image access flags in st_bind_images Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:43 -05:00
Nicolai Hähnle	71a1b54b33	gallium: add access field to pipe_image_view This allows drivers to make smarter decisions e.g. about whether the image has to be decompressed. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:40 -05:00
Nicolai Hähnle	8c497b8fb5	st/glsl_to_tgsi: set FS_EARLY_DEPTH_STENCIL when required Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:37 -05:00
Nicolai Hähnle	e526f930aa	tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:33 -05:00
Nicolai Hähnle	1c0cee8764	st/glsl_to_tgsi: set memory access type on image intrinsics This is required to preserve the image variable's coherent/restrict/volatile qualifiers in TGSI. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:30 -05:00
Nicolai Hähnle	dfcf420412	st/glsl_to_tgsi: provide Texture and Format information for image ops Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:26 -05:00
Nicolai Hähnle	3243b6fc97	tgsi: add Texture and Format to tgsi_instruction_memory Frontends should have this information readily available, and it simplifies image LOAD/STORE/ATOM* handling especially with indirect image access. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:02 -05:00
Nicolai Hähnle	9b68bdf6f8	get: reconcile aliasing enums for MaxCombinedShaderOutputResources The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only appear once. Noticed while implementing ARB_shader_image_load_store without previously implementing SSBO. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-14 17:19:14 -05:00
Francisco Jerez	b054605722	i965/fs: Restrict inequality that can only hold equal in saturate propagation. Should have no functional change. The IP value of an instruction that reads src_var cannot possibly be after the end of the live interval of the variable it's reading from, by the definition of live interval. Might save future readers a momentary WTF while trying to understand this code. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:58:19 -07:00
Francisco Jerez	7d7990cf65	i965/vec4: Consider removal of no-op MOVs as progress during register coalesce. Bug found by the liveness analysis validation pass that will be introduced in a later commit. The no-op MOV check in opt_register_coalesce() was removing instructions which makes the cached liveness analysis calculation inconsistent with the shader IR. We were failing to set progress to true in that case though, which means that invalidate_live_intervals() wouldn't necessarily be called at the end of the function. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:58:11 -07:00
Francisco Jerez	93be4158ae	i965/fs: Add missing analysis invalidation in fixup_3src_null_dest(). Bug found by the liveness analysis validation pass that will be introduced in a later commit. fixup_3src_null_dest() was allocating registers which makes the cached liveness analysis calculation incomplete, so it must be invalidated. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:57:58 -07:00
Francisco Jerez	6691c03fd3	i965/fs: Add missing analysis invalidation in opt_sampler_eot(). Bug found by the liveness analysis validation pass that will be introduced in a later commit. opt_sampler_eot() was allocating registers and inserting and removing instructions, which makes the cached liveness analysis calculation inconsistent with the shader IR, so it must be invalidated. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:56:02 -07:00
Hans de Goede	4d02e91e49	clover: Fix pipe_grid_info.indirect not being initialized. After pipe_grid_info.indirect was introduced, clover was not modified to set it causing it to pass uninitialized memory for it to launch_grid. This commit fixes this by zero-ing the entire pipe_grid_info struct when declaring it, to avoid similar problems popping-up in the future. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [ Francisco Jerez: Trivial codestyle fix. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-14 14:12:42 -07:00
Sarah Sharp	af06190760	mesa: docs: Intel i965 hardware limits. This should help the next person working on hardware enabling figure out where in the Intel PRMs to find the magic platform hardware values. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>	2016-03-14 14:00:29 -07:00
Sarah Sharp	0f5bfc7f01	mesa: docs: i965: Use correct doxygen groupings syntax When reading the source code, it's useful to indicate that a group of fields in a struct are related in some way. There were several places where people tried to group related structure members with the @{ syntax, without realizing they also needed to add the \name syntax in order to generate correct doxygen html. There are several files with groupings that look like this: struct foo { /** * Related fields description * @{ */ int bar; char baz; /** @} */ long qux; } However, the doxygen syntax for grouping is: struct foo { /** * \name Related fields description * @{ */ int bar; char baz; /** @} */ long qux; } https://www.stack.nl/~dimitri/doxygen/manual/grouping.html Without the group name definition, the fields don't get properly grouped. Instead, the group description is applied to the first field. Fix the Intel hardware information structure, brw_device_info, to properly group the GPU hardware limitations and hardware quirks fields. Once you've run `cd doxygen; make clean; make all`, updated documentation can be found at mesa/doxygen/i965/structbrw__device__info.html Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>	2016-03-14 14:00:29 -07:00
Bruce Cherniak	e9d68cc3da	gallium/swr: Resource management Better tracking of resource state and synchronization. A follow on commit will clean up resource functions into a new swr_resource.cpp file. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2016-03-14 14:07:48 -05:00
Marek Olšák	7a2333e4ef	configure.ac: require libdrm 2.4.66 for drmGetDevice since `737b6ed13e` src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c no longer compiles: error: unknown type name ‘drmDevicePtr’	2016-03-14 16:42:41 +01:00
Francisco Jerez	63250d8178	i965: Remove useless IR self-destruct backend_shader method. From the point it's constructed the CFG contains the only existing copy of the program IR, and it never becomes invalid. Calling backend_shader::invalidate_cfg would have destroyed the program structure irrecoverably -- We weren't calling it at all for a good reason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-13 18:07:53 -07:00
Pierre Moreau	8c7acd87af	nv50,nvc0: Set only NEW_CP_GLOBALS upon binding Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 22:34:50 +01:00
Rob Clark	e73ac84b93	freedreno/ir3: lower extract_byte/word The following commits broke things by starting to feed us unhandled extract_u16/extract_u8 opcodes: commit `905ff86198` Author: Matt Turner <mattst88@gmail.com> AuthorDate: Wed Feb 3 14:28:31 2016 -0800 Commit: Matt Turner <mattst88@gmail.com> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u16. commit `76289fbfa8` Author: Matt Turner <mattst88@gmail.com> AuthorDate: Thu Jan 21 09:09:48 2016 -0800 Commit: Matt Turner <mattst88@gmail.com> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u8. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 14:10:57 -04:00
Ilia Mirkin	c1e4a6bfbf	nv50,nvc0: handle SQRT lowering inside the driver First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to find out whether the input is less than 0). Secondly the current approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced instead of inf. Instead we switch to the less accurate rcp(rsq(x)) method - this behaves nicely for all valid inputs. We still don't do this for DSQRT since the RSQ/RCP ops are really inaccurate, and don't even have Newton-Raphson steps right now. Eventually we should have a separate library function for DSQRT that does it more precisely (and perhaps move this lowering to the post-opt phase). This fixes a number of dEQP precision tests that were expecting better behavior for infinite inputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 13:17:24 -04:00
Ilia Mirkin	b3e7fb5234	nv50/ir: avoid folding mul + add if the mul has a dnz Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 13:17:24 -04:00
Ilia Mirkin	a651bc027d	nvc0: fix blit triangle size to fully cover FB's > 8192x8192 The idea is that a single triangle will cover the whole area being drawn, allowing the blit shader to do its work. However the max fb size is 16384x16384, which means that the triangle we draw needs to be twice that in order to cover the whole area fully. Increase the size of the triangle to 32768x32768. This fixes a number of dEQP tests that were failing because a blit was involved which would miss some of the resulting texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-13 13:17:24 -04:00
Rob Clark	01b071d530	freedreno: OUT_RELOC vs OUT_RELOCW fixes Make sure we use OUT_RELOCW() in cases where the buffer is written to. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	f68c6951b8	freedreno/a4xx: hw binning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	b3fe196e21	freedreno/a4xx: use generated headers for draw initiator No need to open-code this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	2224ba5976	freedreno/a4xx: remove RB_RENDER_CONTROL patching Bitfields were shuffled around for the better on a4xx, so we don't need any patching on this one. It appears to be something we set entirely in the gmem code so no conflict between tiling and render state like we had in a3xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	8824a765a2	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	476551a21f	freedreno/a3xx: move where we deal w/ binning FS Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	dd9135c452	freedreno/a4xx: move where we deal w/ binning FS Move where we pick dummy FS for binning pass, so the whole driver sees the same dummy/no-op FS stage. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	09b3447344	freedreno/a3xx: constify the shader variants Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	5b955f09f7	freedreno/a4xx: constify the shader variants Most of the driver just needs read-only access, so constify. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:40 -04:00
Rob Clark	d9395e4ed8	freedreno/a3xx: remove duplicate mark of end of binning cmds Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:40 -04:00
Nicolai Hähnle	28d2a7e67c	radeonsi: avoid crash when a sampler state is bound for a buffer texture Sampler states don't really make sense with buffer textures, but they can be set anyway, so we need to be defensive here. This bug was lurking for a while and was finally noticed due to PBO uploads setting sampler states. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Tested-by: Shawn Starr <shawn.starr@rogers.com>	2016-03-13 09:37:23 -05:00
Matt Turner	61b10b4eb7	i965: Use foreach_in_list_reverse_safe() macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-12 19:23:50 -08:00
Jason Ekstrand	98d58e7320	nir/clone: Add support for cloning a single function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	036b209484	nir/validate: Better function validation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	f86f3c90aa	nir/print: Better function argument printing Since we aren't going to put the function parameters or the return variable in the list of locals, it won't get a proper declaration. This changes nir_print to print the type along with each parameter or return variable. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	13969565f9	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	e4bebe8a02	nir: Create function parameters in function_impl_create Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	066d3c115e	nir: Add a helper for creating a "bare" nir_function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	2ef4754a20	nir: Add a new "param" variable mode for parameters and return variables Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	41ae553fda	nir/glsl: Remove dead function parameter handling code NIR has never been used on IR where we haven't already done function inlining so this code has been dead from the beginning. Let's just get rid of it for now. We can always put it back in if we decide to use NIR for function inlining at some point in the future. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jordan Justen	b83785d86d	anv/gen7: Add stall and flushes before switching pipelines This is a port of `18c76551ee` from OpenGL to Vulkan. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 13:13:37 -08:00
Jordan Justen	c8ec65a1f5	anv: Add flush_pipeline_before_pipeline_select flush_pipeline_before_pipeline_select adds workarounds required before switching the pipeline. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 13:13:37 -08:00
Jordan Justen	1b126305de	anv/genX: Add flush_pipeline_select_gpgpu Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 12:43:46 -08:00
Jason Ekstrand	41af9b2e51	HACK: Don't re-configure L3$ in render stages pre-BDW This fixes a "regression" on Haswell and prior caused by merging the gen7 and gen8 flush_state functions. Haswell should still work just fine if you're on a 4.4 kernel, but we really should make it detect the command parser version and do something intelligent.	2016-03-12 08:57:16 -08:00
Boyuan Zhang	6cf120ec77	st/va: add HEVC main 10 profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Boyuan Zhang	06c862d67d	radeon/video: enable HEVC main 10 decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Boyuan Zhang	8be9efcce7	radeon/uvd: handle HEVC main 10 decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Jason Ekstrand	753ebe4457	anv/x11: Reset the SHM fence before presenting the pixmap This seems to fix the flicker issue that I was seeing with dota2	2016-03-11 17:22:46 -08:00
Kristian Høgsberg Kristensen	9bff5266be	anv/x11: Add present support The old DRI3 implementation just used CopyArea instead of present. We still don't support all the MST fanciness, but it should at least avoid some copies and allow for. v2 (Jason Ekstrand): - Better object cleanup and destruction - Handle the CONFIGURE_NOTIFY event and return OUT_OF_DATE when needed - Track dirtiness via IDLE_NOTIFY rather than iterating through the images sequentially	2016-03-11 16:54:17 -08:00
Jason Ekstrand	e920b184e9	anv/x11: Split image creation into a helper function This lets us clean up error handling and make it correct.	2016-03-11 12:28:34 -08:00
Jason Ekstrand	41a147904a	anv/wsi: Throttle rendering to no more than 2 frames ahead Right now, Vulkan apps can pretty easily DOS the GPU by simply submitting a lot of batches. This commit makes us wait until the rendering for earlier frames is complete before continuing. By waiting 2 frames out, we can still keep the pipe reasonably full but without taking the entire system down. This is similar to what the GL driver does today.	2016-03-11 11:31:13 -08:00
Jason Ekstrand	132f079a8c	anv/gem: Use C99-style struct initializers for DRM structs This is more consistent with the way the rest of the driver works and ensures that all structs we pass into the kernel are zero'd out except for the fields we actually want to fill. We were previously doing this when building with valgrind to keep valgrind from complaining. However, we need to start doing this unconditionally as recent kernels have been getting touchier about this. In particular, as of kernel commit b31e51360e88 from Chris Wilson, context creation and destroy fail if the padding bits are not set to 0.	2016-03-11 11:31:03 -08:00
Ben Widawsky	d1ab544bb8	i965/chv: Display proper branding "Braswell" is a Cherryview based thing. It unfortunately requires extra information to determine its marketing name. Unlike all previous products, and hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to brand string. I put up a fight about adding any complexity to our GL renderer string code for a very long time. However, a wise man made a comment to me that I couldn't argue with: if a user installs Windows on their hardware, the brand string should be the same as what we display in Linux. The Windows driver apparently does this check, so we should too. Note that I did manage to find a good use for this info anyway in the compute shader thread counts. v2: memcpy instead of strncpy, and some minor changes (Matt) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	5e6a43a001	i965/chv: Update lower min for CS threads We have better information now, and 28 was not a valid thing to support. 6 EUs per subslice with 7 threads per EU is the minimum supported config. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	3dc3dbc8d8	i965/chv: Check that compute threads are above threshold The way we are organizing this code, the statically configured max_cs_threads should always be the minimum value we actually support (i.e., are aware of). As a result, we can fall back to that if we get invalid numbers from the kernel (i.e., when the query succeeds, but the result is lower than expected). I was originally planning to use an assert, but there is no reason to be so mean. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	9dd20b715a	i965/chv: Use kernel provided info for max_cs_threads With the previous patches, the code can find out the actual number of available compute threads. It is enabled only for Cherryview since that is the only platform I know for a fact has shipped devices which can benefit from this. It seems like other platforms /might/ benefit from this because of fused configurations which /might/ have shipped. Fallback code is still there. v2: Some minor adjustments from Matt Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	38eb606884	i965: Query and store GPU properties from kernel Certain products are not uniquely identifiable based on device id alone. The kernel exports an interface to help deal with this. This patch merely introduces the consumer of the interface and makes sure nothing breaks. It is also possible to use these values for programming GPGPU mode, and I plan to do that as well. The interface was introduced in libdrm 2.4.60, which is already required, so it should all be fine. v2: Some minor changes recommended by Matt Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Nicolai Hähnle	9908b13af6	st/mesa: check that the image unit is valid in st_bind_images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-11 11:53:40 -05:00
Bas Nieuwenhuizen	417b6721a0	radeonsi: Lazily re-set sampler views after disabling DCC Clear DCC flags if necessary when binding a new sampler view. v2: Do not reset DCC flags of bound sampler views. v3: Check that we have a real texture (Nicolai) Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-11 11:51:15 -05:00
Marek Olšák	af3454cad5	st/mesa: remove ST_NEW_MESA flag (v2) Only used indirectly when checking dirty.st != 0 v2: also update st_cb_compute.c Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-11 16:07:18 +01:00
Nicolai Hähnle	e502801d98	r600g: clear compressed_depthtex/colortex_mask when binding buffer texture Found by inspection of the source based on a bisected bug report. This bug has been in the code for a long time, but the more recent PBO upload feature exposed it because it leads to more uses of buffer textures. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94388 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-11 08:00:15 -05:00
Ilia Mirkin	f8ea98e4ec	st/mesa: add GL_ARB_shader_atomic_counter_ops support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-10 22:36:17 -05:00
Ilia Mirkin	075a5742bf	mesa: add GL_ARB_shader_atomic_counter_ops support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-10 22:34:46 -05:00
Ilia Mirkin	a8819fb1ff	nvc0: add support for TGSI FMA ops This will allow the nouveau backend to not try and split up ops that are fused in GLSL. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-10 22:34:28 -05:00
Nicolai Hähnle	59c5508b9a	radeonsi: update compressed_colortex_masks when a cmask is created or disabled Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:52 -05:00
Nicolai Hähnle	da68a9b215	radeonsi: move si_decompress_textures to si_blit.c Since it is all about calling into blitter functions, it makes more sense here. This change also reduces the size of the interfaces between .c files. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:49 -05:00
Nicolai Hähnle	f03c9e5692	r600g: update compressed_colortex_masks when a cmask is created or disabled Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:46 -05:00
Nicolai Hähnle	784269aa40	gallium/radeon: notify all contexts when cmasks are enabled/disabled There is an annoying corner case that I stumbled across while looking into piglit's arb_shader_image_load_store/execution/load-from-cleared-image.shader_test (which can be easily adapted to demonstrate the bug without the ARB_shader_image_load_store extension). When we bind a texture and then clear it using glClear (by attaching it to the current framebuffer) for the first time, we allocate a separate cmask for the texture to do fast clear, but the corresponding bit in compressed_colortex_mask is not set. Subsequent rendering will use incorrect data. Conversely, when a currently bound texture with an existing cmask is exported, leading to that cmask being disabled, the compressed_colortex_mask bit will remain set, leading to an assertion later on in debug builds. Since iterating through all contexts and/or remembering where every texture is bound would be costly, and cmask enable/disable should be rare, we will maintain a global counter to signal contexts that they must update their compressed_colortex_masks. This patch introduces the global counter, and subsequent patches will do the mask update. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:00 -05:00
Kenneth Graunke	9ea00c6f6b	i965: Set a proper _BaseFormat for window system renderbuffers in ES. intel_alloc_private_renderbuffer_storage did: rb->_BaseFormat = _mesa_base_fbo_format(ctx, internalFormat); Unfortunately, internalFormat was usually an unsized format (such as GL_DEPTH_COMPONENT). In OpenGL ES, _mesa_base_fbo_format() refuses to accept unsized formats, and returns 0 rather than a real base format. This meant that we ended up with a completely bogus rb->_BaseFormat for window system buffers on OpenGL ES. All other renderbuffer allocation functions in intel_fbo.c instead use the mesa_format, and do: rb->_BaseFormat = _mesa_get_format_base_format(...); We can do likewise, using rb->Format. This appears to work just fine. dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial failed, as it tried to perform a GL_FRAMEBUFFER_ATTACHMENT_DEPTH_SIZE query on the window system depth buffer. That query relies on a proper rb->_BaseFormat being set, so it broke because rb->_BaseFormat was 0 due to the above bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94458 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-10 11:23:52 -08:00
Kenneth Graunke	e032e4ad5a	glcpp: Fix locations when encountering "#<NEWLINE>". We were failing to reset our location tracking when encountering a NEWLINE in the <HASH> state. Rip the code from the <*>{NEWLINE} rule, which handles this properly. Also, update 146-version-first-hash.c to have proper expectations. When I introduced the test, I didn't verify that the line/column numbers were correct, and it turns out they varied based on the type of newline ending. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94447 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-10 11:23:26 -08:00
Jason Ekstrand	1f3d582cba	isl/surface_state: Set the clear color	2016-03-10 10:41:52 -08:00
Jason Ekstrand	8c819b8c2b	genxml/gen75: Add the clear color bits to RENDER_SURFACE_STATE	2016-03-10 10:41:52 -08:00
Jason Ekstrand	6f47ed28b4	isl: Add more helpers for determining if a format is an integer format	2016-03-10 10:41:52 -08:00
Jason Ekstrand	b0e423cc4f	isl: Remove redundant check The green channel was checked twice.	2016-03-10 10:41:52 -08:00
Tim Rowley	84f857bef7	gallium/swr: remove use of BYTE from swr driver Remove use of a win32-style type leaked from the swr rasterizer. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-03-10 11:20:58 -06:00
Samuel Pitoiset	dad3e5f4ef	nvc0: expose SM35 perf counters to AMD_performance_monitor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:40 +01:00
Samuel Pitoiset	0e511400de	nvc0: add driver metrics for SM35 (GK110) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:38 +01:00
Samuel Pitoiset	bf840aa523	nvc0: add MP performance counters for SM35 (GK110) Because compute support is not enabled by default for these chipsets, NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable performance counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:35 +01:00
Samuel Pitoiset	f289e99dee	nvc0: explode config of Kepler hardware SM events This is really verbose but most of the configuration will be reused for SM35 (GK110). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:32 +01:00
Samuel Pitoiset	a0ce8536b3	nvc0: rework the driver metrics infrastructure This follows the same design as MP perf counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:29 +01:00
Samuel Pitoiset	41fb87249a	nvc0: rework the MP counters infrastructure This mainly improves how we define the different list of queries. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:26 +01:00
Marek Olšák	7b29188a3f	egl: clean up typedef madness in the backend API let's use the dd.h format Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-10 18:03:14 +01:00
Iago Toral Quiroga	3e3de9ec0a	glsl: report correct number of allowed vertex inputs and fragment outputs Before we would always report 16 for both and we would only fail if either one exceeded 16. Now we fail if the maximum for each is exceeded, even if it is smaller than 16 and we report the correct maximum. Also, expand the size of to_assign[] to 32. There is code at the top of the function handling max_index up to 32, so this just makes the code more consistent. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-10 08:48:53 +01:00
Vinson Lee	d46feee697	nouveau: Fix clang reserved-user-defined-literal error. CXX codegen/nv50_ir.lo In file included from codegen/nv50_ir.cpp:28: ./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal] fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args) ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-09 23:00:45 -08:00
Kenneth Graunke	3823b53ff8	mesa: Make glGetInteger64v convert float/doubles to 32-bit integers. According to the GL 4.4 core specification, section 2.2.2 ("Data Conversions For State Query Commands"): "If a command returning integer data is called, such as GetIntegerv or GetInteger64v, a boolean value of TRUE or FALSE is interpreted as one or zero, respectively. A floating-point value is rounded to the nearest integer, unless the value is an RGBA color component, a DepthRange value, or a depth buffer clear value. In these cases, the query command converts the floating-point value to an integer according to the INT entry of table 18.2; a value not in [−1, 1] converts to an undefined value." The INT entry of table 18.2 shows that b = 32, meaning the expectation is to convert it to a 32-bit integer value. Fixes: dEQP-GLES3.functional.state_query.floats.blend_color_getinteger64 dEQP-GLES3.functional.state_query.floats.color_clear_value_getinteger64 dEQP-GLES3.functional.state_query.floats.depth_clear_value_getinteger64 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94456 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-09 19:44:18 -08:00
Nanley Chery	7fbbad0170	anv/blit2d: Use the tiling enum for simplicity Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	514c055717	anv/meta: Prefix anv_ to meta_emit_blit() Follow the convention for non-static functions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	627728cce5	anv/meta: Split anv_meta_blit.c into three files The new organization is as follows: * anv_meta_blit.c: Blit and state setup/teardown commands * anv_meta_copy.c: Copy and update commands * anv_meta_blit2d.c: 2D Blitter API commands Also, change the formatting to contain most lines within 80 columns. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	f391683922	anv/meta: Make meta_emit_blit() public This can be reverted if the only other consumer, anv_meta_blit2d(), uses a different method. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	ddbc645846	anv/meta: Store src and dst usage flags in a variable Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	7ebbc3946a	anv/meta: Minimize height of images used for copies In addition to demystifying the value being added to the height, this future-proofs the code for new tiling modes and keeps the image height as small as possible. v2: Actually use the smallest height possible. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Emil Velikov	3dc2630e45	gallium/radeon: use explicit drm_major, drm_minor check Just like everywhere else in the radeon codebase. v2: Don't forget about drm_major == 3 (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 17:25:22 +00:00
Emil Velikov	b9c5c4af6d	egl/x11: check the return value of xcb_dri2_get_buffers_reply() ... before using it. The function can return NULL, which we should check prior to referencing it in the next function(s). Cc: Fabian Vogt <fvogt@suse.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93667 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-09 17:25:22 +00:00
Emil Velikov	373f118c6c	gallium: do not wrap header inclusion in extern "C" Add one missing extern C guard within include/pipe/p_video_enums.h, and remove the wrapping throughout gallium. On Haiku one could even use the gallium debug_printf() although that's another topic. v2: Leave dbghelp.h as is (Jose) Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-09 17:21:39 +00:00
Dieter Nützel	69d389c52f	opencl: fix .gitignore for .install-gallium-links Fixes: `0b6157e971` "install-gallium-links: port changes from install-lib-links" v2: move this to the top level .gitignore and added Fixes: like Emil Velikov <emil.l.velikov@gmail.com> suggested Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:52 +00:00
Emil Velikov	f3e23ead53	egl: remove remnants of MESA_drm_display Last set in st/egl, unused in mesa-demos and superseded by EGL_KHR_platform_gbm. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	2295a4b1e1	egl: remove final pieces of KHR_vg_parent_image Similar to previous commit - unused/unset for a long time. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	c85544a10c	glapi: remove the final function offset tags A commit earlier this year reworked our python scripts to use a separate file for these. Followed by removing support from the parser, and removing all of the offset tags. Seems like we either missed a few, or people added them by mistake. Either way let's nuke the ones that are still around. Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	3ffab9a89c	winsys/amdgpu/addrlib: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	a07192bd63	mesa/main: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	5351dc1522	i915: limit extern "C" hack only for libdrm headers Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	cf215d92f6	xmesa: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	2af3a0ca6f	util/sha: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	d426c17550	egl/wayland: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	750da80b34	gbm: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Nicolai Hähnle	9f06e7f5c1	st/mesa: shader image atoms must be before framebuffer update The reason is that the shader image atoms call st_finalize_texture, which may set ST_NEW_FRAMEBUFFER. This fixes an assertion triggered by a subtest of piglit's arb_shader_image_load_store-invalid. v2: add comment explaining order constraints (suggested by Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:40:06 -05:00
Nicolai Hähnle	4eb416bd9d	gallivm: special case TGSI_OPCODE_STORE This instruction has the resource (buffer or image) as a destination to represent the writemask for SSBO writes. However, this is obviously not a "real" destination for the purpose of emitting LLVM IR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:39:55 -05:00
Nicolai Hähnle	10b2b584ee	tgsi: set correct output mode for RESQ Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:39:43 -05:00
Marek Olšák	dcb2b77823	gallium: add CAPs returning PCI device location Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-09 15:02:28 +01:00
Marek Olšák	737b6ed13e	winsys/amdgpu: get PCI info This will be queried by the OpenCL stack using an interop call. I have tested that the values match lspci. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:28 +01:00
Marek Olšák	ec74deeb24	radeonsi: set amdgpu metadata before exporting a texture Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:28 +01:00
Nicolai Hähnle	ff7e9412be	radeonsi: extract the texture descriptor computation into its own function This will allow this code to be re-used for shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Nicolai Hähnle	1197c69bdd	radeonsi: extract the buffer descriptor computation into its own function This will allow it to be re-used for shader image descriptors. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Nicolai Hähnle	2bf8ee34b8	radeonsi: remove resource field from si_sampler_view view->resource is redundant with view->base.texture, so get rid of it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	2dec5e09e1	radeonsi: accept pipe_resource in si_sampler_view_add_buffer and rename .._buffers -> .._buffer Based loosely on Nicolai's patch. This will make it easier to cherry-pick Nicolai's patches from his image support branch. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	f18fc70d6f	radeonsi: disable DCC on handle export if expecting write access This should be okay except that sampler views and images are not re-set. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Bas Nieuwenhuizen	1e48ec7571	radeonsi: add DCC decompression (v2) This is currently not needed but will be necessary when we have features that do not work with DCC enabled, such as image stores and sharing non-scanout surfaces. v2: Marek: rebase, remove decompression from si_flush_resource (not needed) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	b744ac9f44	radeonsi: allocate DCC in the same backing buffer as the texture To allow sharing textures with DCC enabled. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	60c08aa90b	gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2) v2: remove the list of all contexts Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	970b979da1	gallium/radeon: eliminate fast color clear before sharing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	abac6bf67a	gallium/radeon: don't use fast color clear if sharing doesn't allow it Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	d4e847ea33	gallium/radeon: disallow handle export for MSAA & depth textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	d95f593758	gallium/radeon: remember that texture_from_handle was called and its flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	c034d3dde0	gallium/radeon: check that handle usage doesn't change for a resource Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	6b187bbd9f	gallium/radeon: disallow reallocation of shared buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	ecbd3aba17	gallium/radeon: if we can't discard a whole resource, discard the range instead Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	afdaffcbdb	gallium/radeon: buffer valid range tracking only works with unshared buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	be73d35829	gallium/radeon: don't set texture metadata for buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	f914779c75	gallium/radeon: set texture metadata only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	69d8b75114	gallium/radeon: clean up r600_texture_get_handle Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	e3cee38e13	gallium/radeon: move code initializing texture metadata to its own function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	f4aa3256ef	winsys/amdgpu: allow drivers to set/get opaque metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	bd1feb2827	gallium/radeon: rename winsys buffer_get/set_tiling to buffer_get/set_metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	6011d7cf25	gallium/radeon: remove rcs parameter from radeon_winsys::buffer_set_tiling This was needed for DRM < 2.12.0 where the kernel was rewriting tiling flags in IBs. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:25 +01:00
Marek Olšák	260ef9c9be	gallium/radeon: use a structure for passing tiling flags from/to winsys and call it radeon_bo_metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:25 +01:00
Marek Olšák	82db518f15	gallium: add external usage flags to resource_from(get)_handle (v2) This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-03-09 15:02:25 +01:00
Axel Davy	d943ac432d	dri: add backbuffer use flag This will be used by the next commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-09 15:02:25 +01:00
Timothy Arceri	2188c77a0e	glsl: don't allow undefined array sizes in ES This applies the rule to empty declarations. Fixes: dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-09 20:30:42 +11:00
Jason Ekstrand	248ab61740	anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer	2016-03-08 17:10:05 -08:00
Jason Ekstrand	28cbc45b3c	anv/cmd_buffer: Split flush_state into two functions	2016-03-08 16:54:07 -08:00
Jason Ekstrand	42b4c0fa6e	anv: Pull all of the genX_foo functions into anv_genX.h This way we only have to declare them each once and we get it for all gens at a single go.	2016-03-08 16:49:08 -08:00
Tamil velan	353a4f844f	radeon/uvd: increase max height to 4096 for VI and newer Without this fix, 'mpv --hwdec=vdpau --vo=vdpau <stream>' fails for vdpau decode if the stream height is 4096. Vdpau decode of heights up to 4096 is a necessary use case on the amdgpu driver for VI and newer platforms. The fix is in the driver-specific implementation of the "Decoder Query Capabilities" API, which now returns 4096 for VI and newer platforms. With this fix vdpauinfo reports height support as 4096 and mpv vdpau decode works fine for 4096 height streams. Signed-off-by: Tamil velan <Tamil-Velan.Jayakumar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-08 19:01:19 -05:00
Bas Nieuwenhuizen	6373845d98	winsys/amdgpu: enlarge buffer_indices_hashlist Enlarge the buffer hashlist to prevent large numbers of misses due to adding more buffers than can be cached in the hashlist. The game I tested had CS's with up to 1500 buffers and the overhead of amdgpu_lookup_buffer for various sizes was: 4096 1.97% (new value) 2048 4.37% 1024 6.92% 512 9.47% (old value) (percentage of CPU usage in render thread as determined by perf) The time spent in amdgpu_add_buffer self is ~4.2% in all cases and for 4096 the time needed to clear the hashlist is still < 0.10%, so I am not expecting significant regressions. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 00:52:07 +01:00
Jason Ekstrand	bbbdd32c19	anv/meta_clear: Use repclear again	2016-03-08 15:40:11 -08:00
Jason Ekstrand	dc504a51fb	anv/pipeline: Unconditionally emit PS_BLEND on gen8+ Special-casing the PS_BLEND packet wasn't really gaining us anything. It's defined to be more-or-less the contents of blend state entry 0 only without the indirection. We can just copy-and-paste the contents. If there are no valid color targets, then blend state 0 will be 0-initialized anyway so it's basically the same as the special case we had before.	2016-03-08 15:40:11 -08:00
Jason Ekstrand	cce65471b8	anv: Compact render targets Previously, we would always emit all of the render targets in the subpass. This commit changes it so that we compact render targets just like we do with other resources. Render targets are represented in the surface map by using a descriptor set index of UINT16_MAX.	2016-03-08 15:40:11 -08:00
Samuel Pitoiset	32e848b016	nvc0: add a new validation path for compute This makes use of the new state validation interface to be consistent with 3d. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-09 00:19:21 +01:00
Samuel Pitoiset	db9b41d302	nvc0: rework the validation path for 3D This exposes an interface for state validation that will be also used to rework the compute validation path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-09 00:19:16 +01:00
Jordan Justen	a100a57e30	i965/hsw: Initialize SLM index in state register For Haswell, we need to initialize the SLM index in the state register. This can be copied out of the CS header dword 0. v2: * Use UW move to avoid changing upper 16-bits of sr0.1 (mattst88) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94081 Fixes: piglit arb_compute_shader/execution/shared-atomics.shader_test Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Jordan Justen	d8347f12ea	i965/compute: Skip SIMD8 generation if it can't be used If the local workgroup size is sufficiently large, then the SIMD8 program can't be used. In this case we can skip generating the SIMD8 program. For complex programs this can save a significant amount of time. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Jordan Justen	e1d54b1ba5	i965/fs: Allow spilling for SIMD16 compute shaders For fragment shaders, we can always use a SIMD8 program. Therefore, if we detect spilling with a SIMD16 program, then it is better to skip generating a SIMD16 program to only rely on a SIMD8 program. Unfortunately, this doesn't work for compute shaders. For a compute shader, we may be required to use SIMD16 if the local workgroup size is bigger than a certain size. For example, on gen7, if the local workgroup size is larger than 512, then a SIMD16 program is required. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Timothy Arceri	91630d7453	glsl: don't always reject shaders with mismatching ifc blocks Since we store some member qualifiers in the interface type we need to be more careful about rejecting shaders just because the pointer doesn't match. It's perfectly valid for some qualifiers such as precision to not match across shader interfaces. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:42 +11:00
Timothy Arceri	3026b3565a	glsl: make interstage_match() static Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:36 +11:00
Timothy Arceri	ebc419fcbd	glsl: don't validate ifc blocks using validation meant for variables Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:31 +11:00
Kenneth Graunke	19f13b2096	mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+. The ES 3.0+ specifications contain the exact same text as the OpenGL specification, which says that we should return GL_INVALID_OPERATION. ES 2.0 contains different text saying we should return GL_INVALID_ENUM. Previously, Mesa chose the error code based on API (GL vs. ES). This patch makes ES 3.0+ follow the GL behavior. ES 2 remains as is. Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo. However, breaks the dEQP-GLES2 variant of the same test for drivers which silently promote to ES 3.0. This can be worked around by exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-08 12:46:28 -08:00
Kenneth Graunke	8b3496f378	mesa: Add GL_RED and GL_RG to ES3 effective internal format mapping. The dEQP-GLES3.functional.fbo.completeness.renderable.texture. {color0,depth,stencil}.{red,rg}_unsigned_byte tests appear to expect GL_RED/GL_RG and GL_UNSIGNED_BYTE to map to GL_R8/GL_RG8, rather than returning an INVALID_OPERATION error. This makes perfect sense. However, RED and RG are strangely missing from the ES 3.0/3.1/3.2 spec's "Effective internal format corresponding to external format and type" tables. It may be worth filing a spec bug. Fixes the 6 dEQP tests mentioned above. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-03-08 12:46:28 -08:00
Samuel Pitoiset	752769e053	nv50,nvc0: make sure to destroy the mutex used for blits This mutex is initialized when the blitter is created, but it is never destroyed. This doesn't hurt anything but it makes sense to destroy it at blitter deletion. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-08 21:24:46 +01:00
Marek Olšák	3146014d5f	gallium/radeon: don't use temporary buffers for persistent mappings Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-08 20:08:52 +01:00
Jason Ekstrand	14b18aba89	nir: Add a pass for lowering indirect variable dereferences This new pass lowers load/store_var intrinsics that act on indirect derefs to an if-ladder of direct load/store_var intrinsics. The if-ladders perform a simple binary search on the indirect. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-08 10:41:54 -08:00
Alejandro Piñeiro	ef76ea4ba9	i965/fs/nir: "surface_access::" prefix not needed "using namespace brw::surface_access" is already present at the top of the source file. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-08 17:55:28 +01:00
Brian Paul	6857420e79	mesa: fix malformed assertion in _image_format_class_to_glenum() Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-03-08 08:42:56 -07:00
Brian Paul	3ed8729f7b	program: minor whitespace clean-ups in program_parse_extra.c	2016-03-08 08:42:56 -07:00
Christian König	37402aa4c6	st/mesa: conditionally enable GL_NV_vdpau_interop Only enable it when we compile the state tracker as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-08 13:00:04 +01:00
Christian König	e148a3b6e9	radeon/uvd: disable MPEG1 The hardware simply doesn't support that correctly. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-08 12:57:08 +01:00
Alejandro Piñeiro	0548844e86	i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need to explicitly set it. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Alejandro Piñeiro	d3a89a7c49	i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor surface_access emit_untyped_read and emit_untyped_atomic provide the same functionality. v2: surface parameter of emit_untyped_atomic is a const, no need to specify default predicate on emit_untyped_atomic, use retype (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Alejandro Piñeiro	0c5c2e2c93	i965/vec4: pass the correct src_sz to emit_send at emit_untyped_atomic If the src is invalid, so src size is zero, the src_sz passed to emit_send should be zero too, instead of a default 1 if we are in a simd4x2 case. This can happen if using emit_untyped_atomic for an atomic dec/inc. v2: use the proper src_sz when calling emit_send, instead of just avoiding loading src at emit_send if BAD_FILE (Francisco Jerez) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Kenneth Graunke	ea9fa5ff05	glcpp: Remove empty mid-rule action which changes test behavior. Apparently this causes a slight difference in the parser's token expectations, leading to a different error message. It seems harmless, but I wanted to be cautious and separate it out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:05 -08:00
Kenneth Graunke	e816c8b54a	glcpp: Clean up most empty mid-rule actions left by previous commit. I didn't want to pollute the previous patch with all the $4 -> $3 changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:03 -08:00
Kenneth Graunke	639bbe3cb4	glcpp: Delete unnecessary implicit version resolves. We now have a bigger hammer. The HASH_TOKEN NEWLINE rule still needs to exist to ensure the 146-version-hash-first.c test still passes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:01 -08:00
Kenneth Graunke	07ec67d85c	glcpp: Implicitly resolve version after the first non-space/hash token. We resolved the implicit version directive when processing control lines, such as #ifdef, to ensure any built-in macros exist. However, we failed to resolve it when handling ordinary text. For example, int x = __VERSION__; should resolve __VERSION__ to 110, but since we never resolved the implicit version, none of the built-in macros exist, so it was left as is. This also meant we allowed the following shader to slop through: 123 #version 120 Nothing would cause the implicit version to take effect, so when we saw the #version directive, we thought everything was peachy. This patch makes the lexer's per-token action resolve the implicit version on the first non-space/newline/hash token that isn't part of a #version directive, fulfilling the GLSL language spec: "The #version directive must occur in a shader before anything else, except for comments and white space." Because we emit #version as HASH_TOKEN then VERSION_TOKEN, we have to allow HASH_TOKEN to slop through as well, so we don't resolve the implicit version as soon as we see the # character. However, this is fine, because the parser's HASH_TOKEN NEWLINE rule does resolve the version, disallowing cases like: # #version 120 This patch also adds the above shaders as new glcpp tests. Fixes dEQP-GLES2.functional.shaders.preprocessor.predefined_macros. {gl_es_1_vertex,gl_es_1_fragment}. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:01:43 -08:00
Jason Ekstrand	75af420cb1	anv/pipeline: Move binding table setup to its own helper	2016-03-07 22:24:31 -08:00
Jason Ekstrand	2308891ede	anv: Store CPU-side fence information in the BO This reduces the number of allocations a bit and cuts back on memory usage. Kind-of a micro-optimization but it also makes the error handling a bit simpler so it seems like a win.	2016-03-07 22:23:44 -08:00
Jason Ekstrand	f61d40adc2	anv/allocator: Better casting in PFL macros We cast the constant 0xfff values to a uintptr_t before applying a bitwise negate to ensure that they are actually 64-bit when needed. Also, the count variable doesn't need to be explicitly cast; it will get upcast as needed by the "|" operation.	2016-03-07 22:23:44 -08:00
Jason Ekstrand	3d4f2b0927	anv/allocator: Move the alignment assert for the pointer free list Previously we asserted every time you tried to pack a pointer and a counter together. However, this wasn't really correct. In the case where you try to grab the last element of the list, the "next element" value you get may be bogus if someone else got there first. This was leading to assertion failures even though the allocator would safely fall through to the failure case below.	2016-03-07 22:23:44 -08:00
Jason Ekstrand	8c2b9d1529	anv/bo_pool: Allow freeing BOs where the anv_bo is in the BO itself	2016-03-07 22:23:44 -08:00
Tim Rowley	90f9df3210	gallium/swr: fix issues preventing a 32-bit build Not a currently tested configuration, but these couple of small changes allow a 32-bit build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94383 Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-07 17:22:24 -06:00
Nanley Chery	181b142fbd	anv/device: Up device limits for 3D and array texture dimensions The limit for these textures is 2048 not 1024. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-07 15:21:50 -08:00
Tim Rowley	035d39b539	gallium/swr: remove use of UINT64 from swr_fence Remove use of a win32-style type leaked from the swr rasterizer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-07 16:58:48 -06:00
Jason Ekstrand	428ffc9c13	anv/device: Actually free the CPU-side fence struct again In `23de78768`, when we switched from allocating individual BOs to using the pool for fences, we accidentally deleted the free.	2016-03-07 14:50:52 -08:00
Kenneth Graunke	af41c0b7e0	glsl: Add function parameters to the parser symbol table. In a shader such as: struct S { float f; } float identity(float S) { return S; } we would think that "S" in "return S" referred to a structure, even though it's shadowed by the "float S" parameter in the inner struct. This led to the parser's grammar seeing TYPE_IDENTIFIER and getting confused. Fixes dEQP-GLES2.functional.shaders.scoping.valid. function_parameter_hides_struct_type_{vertex,fragment}. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-07 14:09:55 -08:00
Kenneth Graunke	c4960068d5	glsl: Add single declaration variables to the symbol table too. The lexer/parser use a symbol table to classify identifiers as variables, functions, or structure types. For some reason, we neglected to add variables in simple declarations such as int x = 5; but did add subsequent variables in multi-declarations: int x = 5, y = 6; // y gets added, but not x, for some reason Fixes four dEQP-GLES2.functional.shaders.scoping.valid subcases: - local_int_variable_hides_struct_type_vertex - local_int_variable_hides_struct_type_fragment - local_struct_variable_hides_struct_type_vertex - local_struct_variable_hides_struct_type_fragment Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-07 14:09:31 -08:00
Kenneth Graunke	1107e48b9a	mesa: Change GLboolean to bool in GenerateMipmap target checker. This is not API facing, so just use bool. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 14:01:34 -08:00
Kenneth Graunke	2f8a43586e	mesa: Make GenerateMipmap check the target before finding an object. If glGenerateMipmap was called with a bogus target, then it would pass that to _mesa_get_current_tex_object(), which would raise a _mesa_problem() telling people to file bugs. We'd then do the proper error checking, raise an error, and bail. Doing the check first avoids the _mesa_problem(). The DSA variant doesn't take a target parameter, so we leave the target validation exactly as it was in that case. Fixes one dEQP GLES2 test: dEQP-GLES2.functional.negative_api.texture.generatemipmap.invalid_target. v2: Rebase on Antia's recent patch to this area. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 14:01:22 -08:00
Samuel Pitoiset	8f99c1bbce	gm107/ir: add emission for ATOMS This allows performing atomic operations on shared memory. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 22:13:14 +01:00
Samuel Pitoiset	7f8565f0b2	tgsi: fix parsing of shared memory declarations The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not with TGSI_FILE_BUFFER. I have found this by using the nouveau_compiler from command line. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-03-07 22:13:08 +01:00
Samuel Pitoiset	c82086f7e9	gm107/ir: add emission for BAR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:50 +01:00
Samuel Pitoiset	8a109c0375	gk110/ir: add missing src predicate emission for BAR.RED Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:48 +01:00
Samuel Pitoiset	f4d2d49152	gk110/ir: allow to emit immediates for BAR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:46 +01:00
Samuel Pitoiset	cba89fdaa1	gk110/ir: fix wrong emission of BAR.SYNC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:43 +01:00
Samuel Pitoiset	5777e87bed	nvc0/ir: make sure that thread count immediate for BAR fit The limit of the thread count immediate value is 12 bits. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:41 +01:00
Brian Paul	3af78b426e	svga: add new surface-write-flushes HUD query To know when we're flushing the command buffer because we need to write to surface in the command buffer. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Brian Paul	7e8cf34546	svga: add new flush-time HUD query To measure the time spent flushing the command buffer. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Brian Paul	903afc370f	svga: also dump SVGA3D_BUFFER surfaces in svga_screen_cache_dump() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Kristian Høgsberg Kristensen	32aa01663f	anv: Quiet pTessellationState warning Some applications pass a dummy for pTessellationState which results in a lot of noise. Only warn if we're actually given tessellation shader stages.	2016-03-06 22:06:24 -08:00
Ilia Mirkin	0941ef3dd5	mesa: flip current tf object back to default if current is being deleted In the rather unusual case of Bind + Delete, we need to make sure that we unbind the current tf object. Fixes dEQP-GLES3.functional.lifetime.delete_bound.transform_feedback Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 00:36:08 -05:00
Ilia Mirkin	f6827e20d1	glsl: avoid stack smashing when there are too many attributes This fixes a crash in dEQP-GLES3.functional.transform_feedback.array_element.separate.points.lowp_mat3x2 and likely others. The vertex shader has > 16 input variables (without explicit locations), which causes us to index outside of the to_assign array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-07 00:36:08 -05:00
Jason Ekstrand	23de78768b	anv: Create fences from the batch BO pool Applications may create a lot of fences, perhaps as much as one per vkQueueSubmit. Really, they're supposed to use ResetFence, but it's easy enough for us to make them crazy-cheap so we might as well.	2016-03-06 14:26:52 -08:00
Francisco Jerez	3dd0441f6c	i965/vec4: Propagate swizzles correctly during copy propagation. This simplifies the code that iterates over the per-component values found in the matching copy_entry struct and checks whether the register regions that were copied to each component are similar enough to be treated as a single (reswizzled) value which can be propagated into the current instruction. Aside from being scattered between opt_copy_propagation(), try_copy_propagate(), and try_constant_propagate(), what I found terribly confusing about the preexisting logic was that opt_copy_propagation() tried to reorder the array of values according to the swizzle of the instruction source, which meant one would have had to invert the reordering applied at the top level in order to find out which component to take from each value (we were just taking the i-th component from the i-th value, which is not correct in general). The saturate mask was also being swizzled incorrectly. This consolidates the logic for matching multiple components of a copy_entry into a single function which returns the result as a regular src_reg on success, as if the copy had been performed with a single MOV instruction copying all components of the src_reg into the destination. Fixes several ARB_vertex_program MOV test-cases from: https://cgit.freedesktop.org/~kwg/piglit/log/?h=arb_program Acked-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	c70b7c80e3	i965: Don't try copy propagation if constant propagation succeeded. It cannot get any better. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	dcf5e19e65	i965/vec4: Use swizzle() to swizzle immediates during constant propagation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	ff7a2b489e	i965: Add support for swizzling arbitrary immediates to (brw_)swizzle(). Scalar immediates used to be handled correctly by swizzle() (as the identity) but since commit `58fa9d47b5` it will corrupt the contents of the immediate. Vector immediates were never handled correctly, but we had ad-hoc code to swizzle VF immediates in the vec4 copy propagation pass. This takes care of swizzling V and UV in addition. v2: Don't implement swizzling of V/UV immediates (Matt). If you need to swizzle an integer vector immediate in the future apply the following diff to go back to v1: --- a/src/mesa/drivers/dri/i965/brw_eu.c +++ b/src/mesa/drivers/dri/i965/brw_eu.c @@ -119,11 +119,10 @@ brw_swap_cmod(uint32_t cmod) static unsigned imm_shift(enum brw_reg_type type, unsigned i) { - assert(type != BRW_REGISTER_TYPE_UV && type != BRW_REGISTER_TYPE_V && - "Not implemented."); - if (type == BRW_REGISTER_TYPE_VF) return 8 * (i & 3); + else if (type == BRW_REGISTER_TYPE_UV || type == BRW_REGISTER_TYPE_V) + return 4 * (i & 7); else return 0; } Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	537d3df974	i965: Pass symbolic swizzle to brw_swizzle() as a single argument. And replace brw_swizzle1() with brw_swizzle(). Seems slightly cleaner and will allow reusing brw_swizzle() in the vec4 back-end more easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:39 -08:00
Ilia Mirkin	ff085d014e	nvc0: reset TFB bufctx when we no longer hold a reference to the buffers This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-06 10:14:52 -05:00
Jason Ekstrand	21ee5fd326	anv: Emit null render targets v2 (Francisco Jerez): Add the state_offset to the surface state offset	2016-03-05 20:47:10 -08:00
Ilia Mirkin	fa43c4bd99	nv50/ir: using sampleid/pos shouldn't force per-sample interpolation See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-05 23:26:03 -05:00
Ilia Mirkin	313205cb8f	st/mesa: don't force per-sample interp if only sampleid/pos are used The OES extensions clarify this behaviour to differentiate between per-sample invocation and per-sample interpolation. Using sampleid/pos will force per-sample invocation but not per-sample interpolation. See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-05 23:26:03 -05:00
Ilia Mirkin	dcbf8377be	swrast: fix GL_ANY_SAMPLES_PASSED values in Result Since commit `922be4eab`, the expectation is that the query result contains the correct value. Unfortunately swrast does not distinguish between GL_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED. As a result, we must fix up the query result in a post-draw fixup. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-03-05 23:25:52 -05:00
Jason Ekstrand	8502794c12	anv/pipeline: Handle null wm_prog_data in 3DSTATE_CLIP	2016-03-05 14:42:16 -08:00
Kristian Høgsberg Kristensen	7b348ab8a0	anv: Fix rebase error	2016-03-05 14:33:50 -08:00
Kristian Høgsberg Kristensen	34326f46df	anv: Turn pipeline cache on by default Move the environment variable check to cache creation time so we block both lookups and uploads if it's turned off.	2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen	f2b37132cb	anv: Check if shader is present before uploading to cache Between the initial check that returns NO_KERNEL and compiling the shader, other threads may have added the shader to the cache. Before uploading the kernel, check again (under the mutex) that the compiled shader still isn't present.	2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen	30bbe28b7e	anv: Always use point size from the shader There is no API for setting the point size and the shader is always required to set it. Section 24.4: "If the value written to PointSize is less than or equal to zero, or if no value was written to PointSize, results are undefined." As such, we can just always program PointWidthSource to Vertex. This simplifies anv_pipeline a bit and avoids trouble when we enable the pipeline cache and don't have writes_point_size in the prog_data.	2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen	6139fe9a77	anv: Also cache the struct anv_pipeline_binding maps This is state that we generate when compiling the shaders and we need it for mapping resources from descriptor sets to binding table indices.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	584f39c65e	anv: Don't re-upload shaders when merging Using anv_pipeline_cache_upload_kernel() will re-upload the kernel and prog_data when we merge caches. Since the kernel and prog_data is already in the program_stream, use anv_pipeline_cache_add_entry() instead to only add the entry to the hash table.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	626559ed37	anv: Add anv_pipeline_cache_add_entry() This function will grow the cache to make room and then add the entry.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	07441c344c	anv: Rename anv_pipeline_cache_add_entry() to 'set' This function is a helper that unconditionally sets a hash table entry and expects the cache to have enough room. Calling it 'add_entry' suggests it will grow the cache as needed.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	87967a2c85	anv: Simplify pipeline cache control flow a bit No functional change, but the control flow around searching the cache and falling back to compiling is a bit simpler.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	2b29342fae	anv: Store prog data in pipeline cache stream We have to keep it there for the cache to work, so let's not have an extra copy in struct anv_pipeline too.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	37c5e70253	anv: Rename 'table' to 'hash_table' in anv_pipeline_cache A little less ambiguous.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	c028ffea70	anv: Serialize as much pipeline cache as we can We can serialize as much as the application asks for and just stop once we run out of memory. This lets applications use a fixed amount of space for caching and still get some benefit.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	cd812f086e	anv: Use 1.0 pipeline cache header The final version of the pipeline cache header adds a few more fields.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	26ed943eb9	anv: Fix shader key hashing This was copied from inline code to a helper and wasn't updated to hash a pointer instead.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	3baf8af947	anv: Remove excess whitespace	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	ab36eae5e7	anv: Remove left-over bits of sparse-descriptor code	2016-03-05 13:50:07 -08:00
Jason Ekstrand	1afdfc3e6e	anv/pipeline: Implement the depth compare EQUAL workaround on gen8+	2016-03-05 09:59:28 -08:00
Jason Ekstrand	7c1660aa14	anv: Don't allow D16_UNORM to be combined with stencil Among other things, this can cause the depth or stencil test to spuriously fail when the fragment shader uses discard.	2016-03-05 09:59:28 -08:00
Jason Ekstrand	9a90176d48	anv/pipeline: Calculate the correct max_source_attr for 3DSTATE_SBE	2016-03-05 09:59:28 -08:00
Brian Paul	a4678311be	st/mesa: 78-column wrapping in st_extensions.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:21:05 -07:00
Brian Paul	9e6a6bd575	gallium/util: add new comments, assertions in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:34 -07:00
Brian Paul	b6a607b221	gallium/util: update comments and URL in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:28 -07:00
Brian Paul	cbca6964e2	gallium/util: make stream variable static in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:23 -07:00
Brian Paul	fb0abedce7	gallium/util: re-indent u_debug_refcnt.[ch] Wrap comments to 78 columns, etc. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:14 -07:00
Brian Paul	a7ba29f6d8	gallium/tests: silence warning in compute.c compute.c: In function ‘launch_grid’: compute.c:435:20: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default] info.input = input; ^ Maybe the pipe_grid_info::input field should be const void *? Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-05 09:15:44 -07:00
Timothy Arceri	31943e6ba5	glsl: replace remaining tabs in link_varyings.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-05 20:50:10 +11:00
Timothy Arceri	e2415e8467	glsl: replace remaining tabs in link_uniforms.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-05 20:50:05 +11:00
Jordan Justen	81f30e2f50	anv/hsw: Move query code to genX file for Haswell This fixes many CTS cases, but will require an update to the kernel command parser register whitelist. (The CS GPRs and TIMESTAMP registers need to be whitelisted.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-05 01:08:07 -08:00
Timothy Arceri	3322cb7b8d	docs: mark align layout qualifier as DONE Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:13 +11:00
Timothy Arceri	037f68d81e	glsl: apply align layout qualifier rules to block offsets From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the OpenGL 4.50 spec: "The align qualifier makes the start of each block member have a minimum byte alignment. It does not affect the internal layout within each member, which will still follow the std140 or std430 rules. The specified alignment must be a power of 2, or a compile-time error results. The actual alignment of a member will be the greater of the specified align alignment and the standard (e.g., std140) base alignment for the member's type. The actual offset of a member is computed as follows: If offset was declared, start with that offset, otherwise start with the next available offset. If the resulting offset is not a multiple of the actual alignment, increase it to the first offset that is a multiple of the actual alignment. This results in the actual offset the member will have. When align is applied to an array, it affects only the start of the array, not the array's internal stride. Both an offset and an align qualifier can be specified on a declaration. The align qualifier, when used on a block, has the same effect as qualifying each member with the same align value as declared on the block, and gets the same compile-time results and errors as if this had been done. As described in general earlier, an individual member can specify its own align, which overrides the block-level align, but just for that member." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:07 +11:00
Timothy Arceri	5a27fefffe	glsl: parse align layout qualifier Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:01 +11:00
Timothy Arceri	22b0082b9d	docs: mark explicit byte offsets as DONE Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:55 +11:00
Timothy Arceri	802262c0af	glsl: use explicit offset when lowering buffer access Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:49 +11:00
Timothy Arceri	96527c3cf2	glsl: copy explicit offset to uniform storage Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:44 +11:00
Timothy Arceri	e12a49ac12	glsl: update comment on offset field The old comment was for the location not the offset, we now use the field for block members so mention that also. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:39 +11:00
Timothy Arceri	9f24f42c49	glsl: add offset to glsl interface type In this patch we also copy the offset value from the ast and implement offset linking rules by adding it to the record_compare() function. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "Two blocks linked together in the same program with the same block name must have the exact same set of members qualified with offset and their integral-constant-expression values must be the same, or a link-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:34 +11:00
Timothy Arceri	8abed7f185	glsl: apply compile-time rules for the offset layout qualifier This implements the rules for the offset qualifier on block members. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "The offset qualifier can only be used on block members of blocks declared with std140 or std430 layouts." ... "It is a compile-time error to specify an offset that is smaller than the offset of the previous member in the block or that lies within the previous member of the block." ... "The specified offset must be a multiple of the base alignment of the type of the block member it qualifies, or a compile-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:30 +11:00
Timothy Arceri	6f45484ac7	glsl: enable offset layout qualifier for ARB_enhanced_layouts Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:26 +11:00
Timothy Arceri	1824ff1c2a	glsl: reject invalid input layout qualifiers Global in validation is already handled, this will do the validation for variables, blocks and block members. This fixes some CTS tests for the new enhanced layouts transform feedback qualifiers. V2: add some more valid input flags Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:09 +11:00
Timothy Arceri	bd53cc7b45	glsl: only apply default stream to output blocks This is needed to allow invalid qualifier checks on inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:04 +11:00
Timothy Arceri	78d3098c05	glsl: rework parsing of blocks Previously interface blocks were given the global default flags of uniform blocks. This meant we could not check for invalid qualifiers on interface blocks because they always contained invalid flags. This changes parsing so that interface blocks now get an empty set of layouts. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:00 +11:00
Timothy Arceri	d244986bf2	glsl: don't apply uniform/buffer layouts to interface blocks In the following patch we will stop setting these layouts by default on interface blocks, so we need to do this to avoid hitting the assert. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:06:56 +11:00
Nanley Chery	4e75f9b219	anv: Implement VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS} v2: Subtract the baseMipLevel and baseArrayLayer (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 21:25:23 -08:00
Kenneth Graunke	4ba7ad6cc1	i965: Only magnify depth for 3D textures, not array textures. When BaseLevel > 0, we magnify the dimensions to fill out the size of miplevels [0..BaseLevel). In particular, this was magnifying depth, thinking that the depth doubles at each level. This is perfectly reasonable for 3D textures, but dead wrong for array textures. Changing the depth != 1 condition to a target == GL_TEXTURE_3D check should make this only happen in the appropriate cases. Fixes about 32 dEQP tests: - dEQP-GLES31.functional.texture.gather.*.level_{1,2} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-03-04 21:25:08 -08:00
Jason Ekstrand	c1436e80ef	anv/meta_clear: Set the right number of dynamic states	2016-03-04 19:18:20 -08:00
Juan A. Suarez Romero	2f76a9924e	i965/vec4: add opportunistic behaviour to opt_vector_float() opt_vector_float() transforms several scalar MOV operations to a single vectorial MOV. This is done when those MOVs cover all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed into: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Neither the vgrf3 nor the vgrf4 .z component is written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it into: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. total instructions in shared programs: 7124660 -> 7114784 (-0.14%) instructions in affected programs: 443078 -> 433202 (-2.23%) helped: 4998 HURT: 0 total cycles in shared programs: 64757760 -> 64728016 (-0.05%) cycles in affected programs: 1401686 -> 1371942 (-2.12%) helped: 3243 HURT: 38 v2: change vectorize_mov() signature (Matt). v3: take into account predicates (Juan). v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2016-03-04 19:16:52 -08:00
Jason Ekstrand	cc57efc67a	anv/pipeline: Fix depthBiasEnable on gen7 The first time I tried to fix this, I set the wrong fields.	2016-03-04 17:56:12 -08:00
Jason Ekstrand	653261285e	anv/cmd_buffer: Reset the state streams when resetting the command buffer	2016-03-04 17:54:29 -08:00
Jason Ekstrand	f700d16a89	anv/cmd_buffer: Include Haswell in set_subpass	2016-03-04 17:54:29 -08:00
George Kyriazis	feb71117ae	st/xlib: Don't destroy screen on XCloseDisplay() screen may still be used by other resources that are not yet freed. To correctly fix this there will be a need to account for resources differently, but this quick fix is not any worse than the original code that leaked screens anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 18:14:46 -07:00
Nanley Chery	a6fb62a864	isl: Fix RenderTargetViewExtent for mipmapped 3D surfaces Match the comment stated above the assignment. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:44 -08:00
Nanley Chery	b80c8ebc45	isl: Get rid of isl_surf_fill_state_info::level0_extent_px This field is no longer needed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:03 -08:00
Jason Ekstrand	d154a5ebd6	anv/cmd_buffer: Let the pipeline set StencilBufferWriteEnable on gen9	2016-03-04 12:23:01 -08:00
Jason Ekstrand	f374765ce6	anv/cmd_buffer: Mask stencil reference values	2016-03-04 12:22:32 -08:00
Jason Ekstrand	d61dcec64d	anv/clear: Pull the stencil write mask from the pipeline The stencil write mask wasn't getting set at all so we were using whatever write mask happened to be left over by the application.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	ec18fef88d	anv/pipeline: Set StencilBufferWriteEnable from the pipeline The hardware docs say that StencilBufferWriteEnable should only be set if StencilTestEnable is set. It seems reasonable to set them together.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fcd8e57185	anv/pipeline: More competent gen8 clipping	2016-03-04 12:03:00 -08:00
Jason Ekstrand	a8afd29653	anv/pipeline: Use the right provoking vertex for triangle fans	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fa8539dd6b	anv/pipeline: Respect pRasterizationState->depthBiasEnable	2016-03-04 12:03:00 -08:00
Matt Turner	1f862e923c	i965/fs: Optimize float conversions of byte/word extract. instructions in affected programs: 31535 -> 29966 (-4.98%) helped: 23 cycles in affected programs: 272648 -> 266022 (-2.43%) helped: 14 HURT: 1 The patch decreases the number of instructions in the two Unigine programs by: #1721: 4374 -> 4155 instructions (-5.01%) #1706: 3582 -> 3363 instructions (-6.11%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	905ff86198	nir: Recognize open-coded extract_u16. No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	76289fbfa8	nir: Recognize open-coded extract_u8. Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float: float((val >> 16u) & 0xffu) float((val >> 8u) & 0xffu) float((val >> 0u) & 0xffu) Instead of shifting, masking, and type converting like this: shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD mov(8) g17<1>F g16<8,8,1>UD shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD mov(8) g20<1>F g19<8,8,1>UD and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD mov(8) g22<1>F g21<8,8,1>UD i965 can simply extract a byte and convert to float in a single instruction: mov(8) g17<1>F g25.2<32,8,4>UB mov(8) g20<1>F g25.1<32,8,4>UB mov(8) g22<1>F g25.0<32,8,4>UB This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float. instructions in affected programs: 28568 -> 27450 (-3.91%) helped: 7 cycles in affected programs: 210076 -> 203144 (-3.30%) helped: 7 This patch decreases the number of instructions in the two Unigine programs by: #1721: 4520 -> 4374 instructions (-3.23%) #1706: 3752 -> 3582 instructions (-4.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Kenneth Graunke	9d7faadd8a	anv: Fix backwards shadow comparisons sample_c is backwards from what GL and Vulkan expect. See intel_state.c in i965. v2: Drop unused vk_to_gen_compare_op. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-04 11:35:46 -08:00
George Kyriazis	01e92e7010	st/xlib: Hang off screen destructor off main XCloseDisplay() callback. This resolves some order dependencies between the already existing callback and the newly created one. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:24 -07:00
George Kyriazis	51e562c3ea	st/xlib: Support unlimited number of display connections There is a limit of 10 display connections, which was a problem for apps/tests that were continuously opening/closing display connections. This fix uses XAddExtension() and XESetCloseDisplay() to keep track of the status of the display connections from the X server, freeing mesa-related data as X displays get destroyed by the X server. Poster child is the VTK "TimingTests" Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:09 -07:00
Brian Paul	192ee9adb1	svga: add new command-buffer-size HUD query To plot a graph of the command buffer size. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	1258f907f4	svga: add new svga_winsys_context::get_command_buffer_size() To ask how large the current command buffer is. Will be used for a new GALLIUM_HUD graph. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	6fc8d90fa9	svga: reorder SVGA_QUERY_ switch cases to match declaration order Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Sinclair Yeh	f1410c5b91	svga: Force an RGBA view creation for an RGBA resource glXCreatePixmap() may specify a GLX_TEXTURE_FORMAT_RGB_EXT format for an RGBA resource, causing us to create an RGBX view for an RGBA resource, a combination vgpu10 does not support. When this is detected, change the request to create an RGBA view instead. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Charmaine Lee	8366701f4c	svga: fix an error in svga_texture_generate_mipmap With this patch, make sure the shader resource view is properly created before referencing it in the generate mipmap command. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Thomas Hellstrom	395c7b8fa1	winsys/svga: Increase the fence timeout If running with a software renderer backend, the timeout may be insufficient, and we don't want to release busy buffers too early. In practice, SVGA gpu lockups are extremely rare. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:55:23 +01:00
Thomas Hellstrom	24ad7e16cd	winsys/svga: Fix an uninitialized return value Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:54:38 +01:00
Kenneth Graunke	9ec246796f	i965: Set MaxFramebufferWidth/Height to 16384, not viewport. dEQP-GLES31.functional.fbo.no_attachments.maximums.{all,height,size,width} started hitting assertion failures when emitting SURFACE_STATE, after commit `e8fd60e789` where Samuel increased the maximum viewport size to 32768, from 16384. MaxFramebufferWidth/Height were being set to the maximum viewport size, but are actually limited by the SURFACE_STATE width/height field range, which is 16384 on Gen7+ (where ARB_framebuffer_no_attachments is exposed). So, reduce these to 16384 explicitly. Fixes assert fails in the above mentioned dEQP tests. (Those tests still fail, however.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-03 21:31:22 -08:00
Francisco Jerez	a6046d217d	glsl: Improve the accuracy of the acos() approximation. The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood. Fixes four dEQP subtests: dEQP-GLES31.functional.shaders.builtin_functions.precision.acos. highp_compute.{scalar,vec2,vec3,vec4} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	2795fbcae3	glsl: Parameterize asin_expr() on the fit coefficients. This will allow us to share the implementation while using different polynomials for asin() and acos(). Francisco Jerez did this in the SPIR-V front-end; I'm merely porting his idea to the GLSL world. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	aa37cbdff7	mesa: Allow Get() of several forgotten IsEnabled() pnames. From section 6.2 ("State Tables") of the GL 2.1 specification (the text also appears in the GL 3.0 and ES 3.1 specifications): "However, state variables for which IsEnabled is listed as the query command can also be obtained using GetBooleanv, GetIntegerv, GetFloatv, and GetDoublev." GL_DEBUG_OUTPUT, GL_DEBUG_OUTPUT_SYNCHRONOUS, and GL_FRAGMENT_SHADER_ATI were missing from the glGet() functions. All other IsEnabled() pnames look to be present, as far as I can tell. Fixes 8 dEQP-GLES31.functional.debug.state_query subtests: debug_output[_synchronous]_get{boolean,float,integer,integer64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	b4b50b074b	mesa: Make glGet queries initialize ctx->Debug when necessary. dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_* tries to call glGet on GL_DEBUG_GROUP_STACK_DEPTH right away, before doing any other debug setup. This should return 1. However, because ctx->Debug wasn't allocated, we bailed and returned 0. This patch removes the open-coded locking and switches the two glGet functions to use _mesa_lock_debug_state(), which takes care of allocating and initializing that state on the first time. It also conveniently takes care of unlocking on failure for us, so we don't need to handle that in every caller. Fixes dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_ {getboolean,getfloat,getinteger,getinteger64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	3ed260f54c	hack to make dota 2 menus work	2016-03-03 16:21:09 -08:00
Jason Ekstrand	56ba13c994	isl/surface_state: Set L2 bypass disable for certain BC* formats	2016-03-03 16:16:57 -08:00
Eduardo Lima Mitev	47392011c0	Update docs to advertise new support for ARB_internalformat_query2 Support in Mesa main and i965 has just been added. v2: Include note in 'New Features' of docs/relnotes/11.3.0.html. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-03 22:19:35 +01:00
Kenneth Graunke	623ce595a9	anv: Compile shader stages in pipeline order. Instead of the arbitrary order modules might be specified in. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:36:19 -08:00
Nanley Chery	8dddc3fb1e	anv/meta: Delete unused functions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:44 -08:00
Nanley Chery	d20f6abc85	anv/meta: Use blitter API for state-handling in Buffer Update/Copy Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:42 -08:00
Nanley Chery	318b67d157	anv/meta: Use blitter API in do_buffer_copy() v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:36 -08:00
Nanley Chery	96ff4d0679	anv/meta: Use blitter API in anv_CmdCopyImage() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:35 -08:00
Nanley Chery	9b6c95d46e	anv/meta: Use blitter API for copies between Images and Buffers Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:20 -08:00
Nanley Chery	91640c34c6	anv/meta: Add function which copies between Buffers and Images v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:15 -08:00
Nanley Chery	61ad78d0d1	anv/meta: Add function to create anv_meta_blit2d_surf from anv_image v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:10 -08:00
Nanley Chery	2e9b08b9b8	anv/meta: Implement the blitter API functions Most of the code in anv_meta_blit2d() is borrowed from do_buffer_copy(). Create an image and image view for each rectangle. Note: For tiled RGB images, ISL will align the image's row_pitch up to the nearest tile width. v2 (Jason): Keep pitch in units of bytes Make src_format and dst_format variables s/dest/dst/ in every usage v3: Fix dst_image width Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:04 -08:00
Nanley Chery	032bf172b4	anv/meta: Modify blitter API fields Some fields are unnecessary. The variables "pitch" and "bs" are used for consistency with ISL. v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:53 -08:00
Jason Ekstrand	654f79a045	anv/meta: Add the beginnings of a blitter API This API is designed to be an abstraction that sits between the VkCmdCopy commands and the hardware. The idea is that it is simple enough that it should be implementable using the blitter but with enough extra data that we can implement it with the 3-D pipeline efficiently. One design objective is to allow the user to supply enough information that we can handle most blit operations with a single draw call even if they require copying multiple rectangles.	2016-03-03 11:24:45 -08:00
Nanley Chery	d1e48b9945	anv/meta: Remove redundancies in do_buffer_copy() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:42 -08:00
Nanley Chery	cfe7036750	anv/meta: Replace copy_format w/ block size in do_buffer_copy() This is a preparatory commit that will simplify the future usage of this function. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:38 -08:00
Nanley Chery	d50ff250ec	anv/meta: Add missing command to exit meta in anv_CmdUpdateBuffer() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:21 -08:00
Nanley Chery	1d9d90d9a6	anv/image: Create a linear image when requested If a linear image is requested, the only possible result should be a linearly-tiled surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:17 -08:00
Nanley Chery	091f1da902	isl: Don't filter tiling flags if a specific tiling bit is set If a specific bit is set, the intention to create a surface with a specific tiling format should be respected. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:23:40 -08:00
Nanley Chery	456f5b0314	isl: Add function to get intratile offsets from x/y offsets Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 10:56:15 -08:00
Jason Ekstrand	206414f92e	anv/util: Fix vector resizing It wasn't properly handling the fact that wrap-around in the source may not translate to wrap-around in the destination. This really needs unit tests.	2016-03-03 08:17:36 -08:00
Antia Puentes	4f028bfcc0	i965: Enable the ARB_internalformat_query2 extension Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev	cbbdf8612d	i965/formatquery: Add support for INTERNALFORMAT_PREFERRED query This pname is tricky. The spec states that an internal format should be returned that is compatible with the passed internal format and has at least the same precision. There is no clear API to resolve this. The closest we have (and what other drivers, e.g. the NVIDIA proprietary one, do) is to return the same internal format given as parameter. But we validate first that the passed internal format is supported by i965. To check for support, we have the 'TextureFormatSupported' map. But this map expects a 'mesa_format', which takes a format+type. So, we must first "come up" with a generic type that is suited for this internal format, then get a mesa_format, and then do the validation. The cleanest solution here is to add a method that does exactly what the spec wants: a driver's preferred internal format from a given internal format. But at this point we lack a clear view of what defines this preference, and also there seems to be no API for it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev	e064f43485	mesa/glformats: Consider DEPTH/STENCIL when resolving a mesa_format _mesa_format_from_format_and_type() is currently not considering DEPTH and STENCIL formats, which are not array formats and are not handled anywhere. This patch adds cases for common combinations of DEPTH/STENCIL format and types. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	ec299602a6	mesa/formatquery: Add (GET_)TEXTURE_IMAGE_TYPE pnames These basically reuse the default implementation of GL_READ_PIXELS_TYPE. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	23f94146c9	mesa/formatquery: Add (GET_)TEXTURE_IMAGE_FORMAT pnames Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	020671f2a3	mesa/formatquery: Add READ_PIXELS_TYPE pname We call the driver to provide its preferred type, but also provide a default implementation that selects a generic type based on the passed internal format. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	bec286f724	mesa/formatquery: Add READ_PIXELS_FORMAT pname Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	09550c16a5	mesa/formatquery: Add support for READ_PIXELS query This has been supported since very early versions of OpenGL, but we still call the driver to give it the opportunity to report a caveat or no support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Alejandro Piñeiro	8d7696f638	mesa/formatquery: added FILTER pname support It filters out the targets and internalformats that, per the spec, explicitly do not support filter types other than NEAREST or NEAREST_MIPMAP_NEAREST. Those are: * Texture buffers target * Multisample targets * Any integer internalformat For the case of multisample targets, the existing method _mesa_target_allows_setting_sampler_parameter is used. This would scale better in the future if new targets appear that don't allow setting sampler parameters. We consider RENDERBUFFER to support LINEAR filters, because although it doesn't support this filter for sampling, you can set LINEAR on a blit operation using glBlitFramebuffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Alejandro Piñeiro	a8736a2567	mesa/texparam: make public target_allows_setting_sampler_parameters In order to allow it to be used by the ARB_internalformat_query2 implementation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	e8ab7727e1	mesa/formatquery: Added framebuffer renderability related queries From the ARB_internalformat_query2 specification: "- FRAMEBUFFER_RENDERABLE: The support for rendering to the resource via framebuffer attachment is returned in <params>. - FRAMEBUFFER_RENDERABLE_LAYERED: The support for layered rendering to the resource via framebuffer attachment is returned in <params>. - FRAMEBUFFER_BLEND: The support for rendering to the resource via framebuffer attachment when blending is enabled is returned in <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is unsupported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	b4ee9f56fd	mesa/formatquery: Added texture gather/shadow related queries From the ARB_internalformat_query2 specification: "- TEXTURE_SHADOW: The support for using the resource with shadow samplers is written to <params>. - TEXTURE_GATHER: The support for using the resource with texture gather operations is written to <params>. - TEXTURE_GATHER_SHADOW: The support for using resource with texture gather operations with shadow samplers is written to <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	557939c08f	mesa/formatquery: Added texture view related queries From the ARB_internalformat_query2 specification: "- TEXTURE_VIEW: The support for using the resource with the TextureView command is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned. - VIEW_COMPATIBILITY_CLASS: The compatibility class of the resource when used as a texture view is returned in <params>. The compatibility class is one of the values from the /Class/ column of Table 3.X.2. If the resource has no other formats that are compatible, the resource does not support views, or if texture views are not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	04e2e0b24a	mesa/textureview: Make _lookup_view_class public It will be used by the ARB_internalformat_query2 implementation to implement the VIEW_COMPATIBILITY_CLASS <pname> query. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	2066c7be61	mesa/formatquery: Added CLEAR_BUFFER <pname> query From the ARB_internalformat_query2 specification: "- CLEAR_BUFFER: The support for using the resource with ClearBuffer*Data commands is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	aed633bb97	mesa/formatquery: Added compressed texture related queries From the ARB_internalformat_query2 specification: "- TEXTURE_COMPRESSED: If <internalformat> is a compressed format that is supported for this type of resource, TRUE is returned in <params>. If the internal format is not compressed, or the type of resource is not supported, FALSE is returned. - TEXTURE_COMPRESSED_BLOCK_WIDTH: If the resource contains a compressed format, the width of a compressed block (in bytes) is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. - TEXTURE_COMPRESSED_BLOCK_HEIGHT: If the resource contains a compressed format, the height of a compressed block (in bytes) is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. - TEXTURE_COMPRESSED_BLOCK_SIZE: If the resource contains a compressed format the number of bytes per block is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. (combined with the above, allows the bitrate to be computed, and may be useful in conjunction with ARB_compressed_texture_pixel_storage)." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	467f462c75	mesa/formatquery: Added simultaneous texture and depth/stencil queries From the ARB_internalformat_query2 specification: "- SIMULTANEOUS_TEXTURE_AND_DEPTH_TEST: The support for using the resource both as a source for texture sampling while it is bound as a buffer for depth test is written to <params>. For example, a depth (or stencil) texture could be bound simultaneously for texturing while it is bound as a depth (and/or stencil) buffer without causing a feedback loop, provided that depth writes are disabled. - SIMULTANEOUS_TEXTURE_AND_STENCIL_TEST: The support for using the resource both as a source for texture sampling while it is bound as a buffer for stencil test is written to <params>. For example, a depth (or stencil) texture could be bound simultaneously for texturing while it is bound as a depth (and/or stencil) buffer without causing a feedback loop, provided that stencil writes are disabled. - SIMULTANEOUS_TEXTURE_AND_DEPTH_WRITE: The support for using the resource both as a source for texture sampling while performing depth writes to the resources is written to <params>. For example, a depth-stencil texture could be bound simultaneously for stencil texturing while it is bound as a depth buffer. Feedback loops cannot occur because sampling a stencil texture only returns the stencil portion, and thus writes to the depth buffer do not modify the stencil portions. - SIMULTANEOUS_TEXTURE_AND_STENCIL_WRITE: The support for using the resource both as a source for texture sampling while performing stencil writes to the resources is written to <params>. For example, a depth-stencil texture could be bound simultaneously for depth-texturing while it is bound as a stencil buffer. Feedback loops cannot occur because sampling a depth texture only returns the depth portion, and thus writes to the stencil buffer could not modify the depth portions. For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	bd45fb3de4	mesa/formatquery: Added queries related to image textures From the ARB_internalformat_query2 specification: "- IMAGE_TEXEL_SIZE: The size of a texel when the resource when used as an image texture is returned in <params>. This is the value from the /Size/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, zero is returned. - IMAGE_COMPATIBILITY_CLASS: The compatibility class of the resource when used as an image texture is returned in <params>. This corresponds to the value from the /Class/ column in Table 3.22. The possible values returned are IMAGE_CLASS_4_X_32, IMAGE_CLASS_2_X_32, IMAGE_CLASS_1_X_32, IMAGE_CLASS_4_X_16, IMAGE_CLASS_2_X_16, IMAGE_CLASS_1_X_16, IMAGE_CLASS_4_X_8, IMAGE_CLASS_2_X_8, IMAGE_CLASS_1_X_8, IMAGE_CLASS_11_11_10, and IMAGE_CLASS_10_10_10_2, which correspond to the 4x32, 2x32, 1x32, 4x16, 2x16, 1x16, 4x8, 2x8, 1x8, the class (a) 11/11/10 packed floating-point format, and the class (b) 10/10/10/2 packed formats, respectively. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned. - IMAGE_PIXEL_FORMAT: The pixel format of the resource when used as an image texture is returned in <params>. This is the value from the /Pixel format/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned. - IMAGE_PIXEL_TYPE: The pixel type of the resource when used as an image texture is returned in <params>. This is the value from the /Pixel type/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	990a7200e0	mesa/shaderimage: Added func to get the GL_IMAGE_CLASS from the format It will be used by the ARB_internalformat_query2 implementation to implement the IMAGE_COMPATIBILITY_CLASS <pname> query. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	52c3692324	mesa/formatquery: Added SHADER_IMAGE_{LOAD,STORE,ATOMIC} <pname> queries From the ARB_internalformat_query2 specification: "- SHADER_IMAGE_LOAD: The support for using the resource with image load operations in shaders is written to <params>. In this case the <internalformat> is the value of the <format> parameter that would be passed to BindImageTexture. - SHADER_IMAGE_STORE: The support for using the resource with image store operations in shaders is written to <params>. In this case the <internalformat> is the value of the <format> parameter that is passed to BindImageTexture. - SHADER_IMAGE_ATOMIC: The support for using the resource with atomic memory operations from shaders is written to <params>." For all of them: "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	876f7a7c08	mesa/shaderimage: Make is_image_format_supported public It will be used by the ARB_internalformat_query2 implementation to implement queries related to the ARB_shader_image_load_store extension. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	fae2b10ff9	mesa/formatquery: Added queries related to texture sampling in shaders From the ARB_internalformat_query2 specification: "- VERTEX_TEXTURE: The support for using the resource as a source for texture sampling in a vertex shader is written to <params>. - TESS_CONTROL_TEXTURE: The support for using the resource as a source for texture sampling in a tessellation control shader is written to <params>. - TESS_EVALUATION_TEXTURE: The support for using the resource as a source for texture sampling in a tessellation evaluation shader is written to <params>. - GEOMETRY_TEXTURE: The support for using the resource as a source for texture sampling in a geometry shader is written to <params>. - FRAGMENT_TEXTURE: The support for using the resource as a source for texture sampling in a fragment shader is written to <params>. - COMPUTE_TEXTURE: The support for using the resource as a source for texture sampling in a compute shader is written to <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	aeb759c7d6	mesa/formatquery: Added SRGB_DECODE_ARB <pname> query From the ARB_internalformat_query2 specification: "- SRGB_DECODE_ARB: The support for toggling whether sRGB decode happens at sampling time (see EXT/ARB_texture_sRGB_decode) for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	bcb2f9cdb9	mesa/formatquery: Added SRGB_{READ,WRITE} <pname> queries From the ARB_internalformat_query2 specification: "- SRGB_READ: The support for converting from sRGB colorspace on read operations (see section 3.9.18) from the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned. - SRGB_WRITE: The support for converting to sRGB colorspace on write operations to the resource is returned in <params>. This indicates that writing to framebuffers with this internalformat will encode to sRGB color spaces when FRAMEBUFFER_SRGB is enabled (see section 4.1.8). Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	e88cbb7a51	mesa/formatquery: Added COLOR_ENCODING <pname> query. From the ARB_internalformat_query2 specification: "- COLOR_ENCODING: The color encoding for the resource is returned in <params>. Possible values for color buffers are LINEAR or SRGB, for linear or sRGB-encoded color components, respectively. For non-color formats (such as depth or stencil), or for unsupported resources, the value NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	b1755535ec	mesa/glformats: Add a helper function _mesa_is_srgb_format() Returns true if the passed format is an sRGB format, false otherwise. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	87b2de3998	mesa/formatquery: Added mipmap related <pname> queries Specifically MIPMAP, MANUAL_GENERATE_MIPMAP and AUTO_GENERATE_MIPMAP <pname> queries. From the ARB_internalformat_query2 specification: "- MIPMAP: If the resource supports mipmaps, TRUE is returned in <params>. If the resource is not supported, or if mipmaps are not supported for this type of resource, FALSE is returned. - MANUAL_GENERATE_MIPMAP: The support for manually generating mipmaps for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is not supported, or if the operation is not supported, NONE is returned. - AUTO_GENERATE_MIPMAP: The support for automatic generation of mipmaps for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is not supported, or if the operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	079d99b830	mesa/genmipmap: Added a function to validate the internalformat It will be used by the ARB_internalformat_query2 implementation to implement mipmap related queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	06852f4b7a	mesa/genmipmap: Added a function to check if the target is valid It will be used by the ARB_internalformat_query2 implementation to implement mipmap related queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	df3a37311d	mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_RENDERABLE <pname> queries Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	c22ceb08bb	mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_COMPONENTS <pname> queries Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	e976a30db8	mesa/formatquery: support for MAX_COMBINED_DIMENSIONS It is implemented by combining the values returned by calls to the 32-bit query _mesa_GetInternalformati32v. The main reason is simplicity. The other option would be to copy and paste how we implemented the support for GL_MAX_{WIDTH/HEIGHT/DEPTH} and GL_SAMPLES. Additionally, doing it this way avoids adding checks to the code, as they are already done by the call to the query itself. MAX_COMBINED_DIMENSIONS is the only pname that the spec identifies as needing a 64-bit query. We handle that possibility by packing the returned value into the first two 32-bit integers of params. This works for the 32-bit query as long as the value is not greater than INT_MAX. In the 64-bit query wrapper we unpack those values to get the final value. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
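The packing described in commit e976a30db8 can be illustrated with a minimal sketch. This is not the Mesa code: the helper names `sketch_pack64` and `sketch_unpack64` are hypothetical, and the low-word-first layout is an assumption chosen so that a 32-bit reader finds the whole value in `params[0]` whenever it fits in INT_MAX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a non-negative 64-bit value in the first
 * two 32-bit slots of params, low word first (assumed layout). */
static void
sketch_pack64(int32_t params[2], int64_t value)
{
   assert(value >= 0);
   uint64_t u = (uint64_t)value;
   uint32_t lo = (uint32_t)(u & 0xffffffffu);
   uint32_t hi = (uint32_t)(u >> 32);

   /* memcpy avoids implementation-defined unsigned-to-signed casts. */
   memcpy(&params[0], &lo, sizeof(lo));
   memcpy(&params[1], &hi, sizeof(hi));
}

/* Hypothetical helper: what a 64-bit wrapper would do to rebuild the
 * final value from the two 32-bit slots. */
static int64_t
sketch_unpack64(const int32_t params[2])
{
   uint32_t lo, hi;

   memcpy(&lo, &params[0], sizeof(lo));
   memcpy(&hi, &params[1], sizeof(hi));
   return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

With this layout a value such as 1024 leaves `params[1]` at zero, so a caller of the 32-bit query reads the correct result from `params[0]` alone, while a larger value is only recoverable through the unpacking step.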
Eduardo Lima Mitev	c5cf16a4fc	mesa/teximage: add _mesa_is_cube_map_texture utility method Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	4e33278b39	main/formatquery: support for MAX_{WIDTH/HEIGHT/DEPTH/LAYERS} Implemented by calling GetIntegerv with the equivalent pname and handling individually the exceptions related to dimensions. All those pnames are used to get the maximum value for each dimension of the given target. The only difference between these calls and calling GetInteger with pnames like GL_MAX_TEXTURE_SIZE, GL_MAX_3D_TEXTURE_SIZE, etc. is that GetInternalformat allows specifying an internalformat. But at this moment, there is no reason to think that the values would differ based on the internalformat. The spec already takes that into account, using these specific pnames as an example in Issue 7 of the arb_internalformat_query2 spec. So this seems like a hook to allow returning different values based on the internalformat in the future. It is worth noting that the piglit test associated with those pnames checks the values returned by GetInternalformat against the values returned by GetInteger, and the test passes with NVIDIA proprietary drivers. v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	b750144b0a	mesa/formatquery: support for IMAGE_FORMAT_COMPATIBILITY_TYPE From arb_internalformat_query2 spec: "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the resource when used as an image textures is returned in <params>. This is equivalent to calling GetTexParameter with <value> set to IMAGE_FORMAT_COMPATIBILITY_TYPE." The current implementation of GetTexParameter for this case returns a field of a texture object, so support for this pname was implemented by creating a temporary texture object and returning that value. It is worth mentioning that right now that field is not reassigned after initialization, so it is effectively hardcoded. An alternative option would be to return that value directly. That doesn't seem really scalable though. v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	e98a3c799f	mesa/formatquery: handle unmodified buffer for SAMPLES on the 64-bit query From arb_internalformat_query2 spec: " If <internalformat> is not color-renderable, depth-renderable, or stencil-renderable (as defined in section 4.4.4), or if <target> does not support multiple samples (ie other than TEXTURE_2D_MULTISAMPLE, TEXTURE_2D_MULTISAMPLE_ARRAY, or RENDERBUFFER), <params> is not modified." So there are cases where the buffer should not be modified. As the 64-bit query is a wrapper over the 32-bit query, we can't just copy the values to the equivalent 32-bit buffer, as that would fail if the original params contained values greater than INT_MAX. So we need to copy back only the values that were modified by the 32-bit query. We do that by filling the temporary buffer with negative values, as the 32-bit query should never return negative values. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
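The copy-back scheme from commit e98a3c799f can be sketched roughly as follows. This is an illustration under stated assumptions, not the Mesa implementation: `sketch_query32`, `sketch_query64`, the `supported` flag, the sample values 8 and 4, and the 16-entry temporary buffer are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PARAMS 16 /* hypothetical temporary-buffer size */

/* Hypothetical stand-in for the 32-bit query: it either writes a few
 * non-negative values or leaves params untouched ("not modified"). */
static void
sketch_query32(int supported, int32_t *params, int count)
{
   if (!supported)
      return;
   for (int i = 0; i < count && i < 2; i++)
      params[i] = (i == 0) ? 8 : 4;
}

/* 64-bit wrapper: pre-fill a temporary buffer with a negative sentinel,
 * run the 32-bit query, then copy back only the entries it changed. */
static void
sketch_query64(int supported, int64_t *params, int count)
{
   int32_t tmp[SKETCH_MAX_PARAMS];

   assert(count >= 0);
   if (count > SKETCH_MAX_PARAMS)
      count = SKETCH_MAX_PARAMS;

   for (int i = 0; i < count; i++)
      tmp[i] = -1; /* sentinel: the 32-bit query never returns negatives */

   sketch_query32(supported, tmp, count);

   for (int i = 0; i < count; i++) {
      if (tmp[i] >= 0)
         params[i] = tmp[i]; /* entry was written by the 32-bit query */
   }
}
```

Because untouched entries keep the sentinel, 64-bit values already present in the caller's buffer, including ones larger than INT_MAX, are never narrowed to 32 bits and survive the call unchanged.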
Alejandro Piñeiro	580816b747	mesa/formatquery: initial implementation for GetInternalformati64v It just does a wrapping on the existing 32-bit GetInternalformativ. We will maintain the 32-bit query as default as it is likely that it would be the one most used. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	7241e1b5f4	mesa/formatquery: Added INTERNALFORMAT_{X}_{SIZE,TYPE} <pname> queries From the ARB_internalformat_query2 spec: "- INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_SHARED_SIZE For uncompressed internal formats, queries of these values return the actual resolutions that would be used for storing image array components for the resource. For compressed internal formats, the resolutions returned specify the component resolution of an uncompressed internal format that produces an image of roughly the same quality as the compressed algorithm. For textures this query will return the same information as querying GetTexLevelParameter{if}v for TEXTURE__SIZE would return. If the internal format is unsupported, or if a particular component is not present in the format, 0 is written to <params>. - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE For uncompressed internal formats, queries for these values return the data type used to store the component. For compressed internal formats the types returned specify how components are interpreted after decompression. For textures this query returns the same information as querying GetTexLevelParameter{if}v for TEXTURE_TYPE would return. Possible values return include, NONE, SIGNED_NORMALIZED, UNSIGNED_NORMALIZED, FLOAT, INT, UNSIGNED_INT, representing missing, signed normalized fixed point, unsigned normalized fixed point, floating-point, signed unnormalized integer and unsigned unnormalized integer components. NONE is returned for all component types if the format is unsupported." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	675418182b	mesa/main: Extend _mesa_get_format_bits to accept new pnames The new pnames accepted by the function are: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE It will be used by the ARB_internalformat_query2 implementation to implement those pnames. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	4a8dae6247	mesa/main: Extend _mesa_base_format_has_channel to accept new pnames The new pnames accepted by the function are: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE It will be used by the ARB_internalformat_query2 implementation to implement those pnames. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	f1c789fa00	mesa/main: Make legal_get_tex_level_parameter_target public It will be used by the ARB_internalformat_query2 implementation to check if the target is valid for those <pnames> that the spec says should return the same values as the 'glGetTexLevelParameter{if}v' function: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_SHARED_SIZE - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE - IMAGE_FORMAT_COMPATIBILITY_TYPE Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Eduardo Lima Mitev	eacb2c971e	mesa/formatquery: Added INTERNALFORMAT_PREFERRED pname Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	56ec2dfcb1	mesa/formatquery: Added the INTERNALFORMAT_SUPPORTED <pname> query Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	4722abc630	mesa/formatquery: Added a func to check <internalformat> supported From the ARB_internalformat_query2 specification: "The INTERNALFORMAT_SUPPORTED <pname> can be used to determine if the internal format is supported, and the other <pnames> are defined in terms of whether or not the format is supported." v2: Consider also FBO base formats when checking if the internalformat is supported. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	5f6e3a0370	mesa/formatquery: Added func to check if the 'resource' is supported Checks that the 'resource', as defined by the ARB_internalformat_query2 specification, is supported by the implementation for those 'pnames' that require this check. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	95392cfa9d	mesa/main: not fill mesa_error on _mesa_legal_texture_base_format_for_target This would allow to use this method if you are just querying if it is allowed, like for arb_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	aaf5ad513b	mesa/teximage: Make _mesa_format_no_online_compression public It will be used by the ARB_internalformat_query2 implementation to check if a certain compressed 'internalformat' is supported by texture 'targets'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	5eef355823	mesa/teximage: make public is_renderable_texture_format It will be used by the ARB_internalformat_query2 implementation to check if the 'internalformat' passed is supported by texture MULTISAMPLE 'targets'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	b5d27bc5dd	mesa/main: Added empty skeleton of glGetInternalformati64v Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	2453bba504	mesa: Add dispatch and extension XML for GL_ARB_internalformat_query2 Equivalent to commit bda540 (that added GL_ARB_internalformat_query) v2: include the new xml in the API_XML list in Makefile.am (Emil Velikov) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	d432337e2d	mesa/formatquery: Added boilerplate code to extend GetInternalformativ The goal is to extend the GetInternalformativ query to implement the ARB_internalformat_query2 specification, keeping the behaviour defined by the ARB_internalformat_query if ARB_internalformat_query2 is not supported. v2: Don't require ARB_internalformat_query when profile is GLES3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	806bc2bf22	mesa/formatquery: Added a func to check if the <target> is supported From the ARB_internalformat_query2 spec: "If the particular <target> and <internalformat> combination do not make sense, or if a particular type of <target> is not supported by the implementation the "unsupported" answer should be given. This is not an error." This function checks if the <target> is supported by the implementation. v2: Allow RENDERBUFFER targets also on GLES 3 profiles. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	4af3e5e9f1	mesa/formatquery: Added function to set 'unsupported' responses The ARB_internalformat_query2 specification defines which response best represents "not supported" or "not applicable" for each <pname>. Queries for unsupported features, targets, internalformats, combinations of: target and internalformat, target and pname, pname and internalformat, do not return an error but the corresponding 'unsupported' response. We will use that response as the default answer. For SAMPLES the 'unsupported' response is to not modify the 'params' buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	a6434f41cc	mesa/formatquery: Added function to validate parameters Handles the cases where an error should be returned according to the ARB_internalformat_query and ARB_internalformat_query2 specifications. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	b89463cdfd	mesa/main: Add extension tracking bit for ARB_internalformat_query2 Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	a347a0f53f	mesa: Completely remove QuerySamplesForFormat from driver func table At this point, all uses have been replaced by the more general hook QueryInternalFormat, introduced by ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	993d7345b7	mesa/formatquery: Use new driver hook QueryInternalFormat Implements SAMPLES and NUM_SAMPLE_COUNTS queries using the new generic driver call QueryInternalFormat, which is being introduced as replacement of QuerySamplesForFormat to support ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	25ee5c60dc	mesa/formatquery: Remove tracking of number of elements in the response Currently, the number of integers returned in the response to GetInternalFormativ is being tracked by a 'count' variable. This is so only the modified elements from the temporary buffer are copied into the original user buffer. However, with the introduction of ARB_internalformat_query2, keeping track of 'count' would complicate the code a lot, considering the high number of queries. So, we propose to forget about tracking count, and move all the 16 elements in the temporary buffer, back to the user buffer (clamped to user buffer size of course). This is basically a trade-off between performance and code clarity. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	1f0b2ce8ec	mesa/multisample: Check sample count using the new driver hook Use QueryInternalFormat instead of QuerySamplesForFormat to obtain the highest supported sample. QuerySamplesForFormat is to be removed. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	ee31b0b1d0	st/format: Replace QuerySamplesForFormat by new QueryInternalFormat hook The previous code for SAMPLES and NUM_SAMPLE_COUNTS is reused as a private function. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	82be7735f3	i965/formatquery: Respond queries SAMPLES and NUM_SAMPLE_COUNTS This effectively disables old QuerySamplesForFormat driver hook, since it is never called by Mesa anymore. v2: Call brw_query_samples_for_format() with a dummy buffer to calculate num samples, to avoid modifying the original buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	2dabff9068	i965: Move brw_query_samples_for_format() to brw_queryformat.c Now that there is a dedicated source file for internal format queries, this function belongs there. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	28144c4476	i965: Add boilerplate function for QueryInternalFormat driver hook By default, we call back the driver's hook fallback function that has generic implementations for all the queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	45054f9702	mesa: Add a default QueryInternalFormat() function for drivers This is a fallback function for drivers not implementing ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	93d30c3de9	mesa: Add QueryInternalFormat to device driver virtual table This new function queries different driver parameters for a particular target and texture format. It is basically a driver hook to support ARB_internalformat_query2. Since ARB_internalformat_query2 introduced several new query parameters over ARB_internalformat_query, having one driver hook for each parameter is no longer feasible. So this is the generic entry-point for calls to glGetInternalFormativ and glGetInternalFormati64v. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Iago Toral Quiroga	283c8372cb	glsl/opt_array_splitting: Fix indentation Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-03 09:12:41 +01:00
Iago Toral Quiroga	4a60002424	glsl/opt_array_splitting: Fix crash when doing array indexing into other arrays When we find indirect indexing into an array, the current implementation of the array splitting optimization pass does not look further into the expression tree. However, if the variable expression involves variable indexing into other arrays, we can miss that these other arrays also have variable indexing. If that happens, the pass will crash later on after hitting an assertion put there to ensure that split arrays are in fact always indexed via constants: shader_runner: opt_array_splitting.cpp:296: void ir_array_splitting_visitor::split_deref(ir_dereference**): Assertion `constant' failed. This patch fixes the problem by letting the pass step into the variable index expression to identify these cases properly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89607 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-03 09:02:30 +01:00
Oded Gabbay	914d4967d7	radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING. This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values. This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-03 09:20:08 +02:00
Oded Gabbay	ef5183faea	r600g: Do colorformat endian swap for PIPE_USAGE_STAGING There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING. This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values. This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian. v2: removed duplicate call to r600_colorformat_endian_swap() inside evergreen_init_color_surface_rat() Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-03 09:20:08 +02:00
Tim Rowley	7bb193d28c	mesa/build: add OpenSWR to build Tested on Linux (centos, ubuntu, and suse variants) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:42 -06:00
Tim Rowley	d003be2a30	gallium/docs - add OpenSWR documentation Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	da4f95d168	gallium/target-helpers: add OpenSWR driver Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	ea37602273	gallium/auxilary: more __cplusplus exports swr driver which is written in C++ needs access to some more gallium utility functions than are currently exposed. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	c6e67f5a93	gallium/swr: add OpenSWR rasterizer Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	2b2d3680bf	gallium/swr: add OpenSWR driver OpenSWR is a new software rasterizer for x86 processors designed for high performance and high scalability on visualization workloads. Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Timothy Arceri	2eec41f6f1	glsl: replace remaining tabs in ir_builder.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-03 11:25:57 +11:00
Anuj Phogat	7026f27e33	mesa: Update comment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	6ccead5b48	mesa: Fix function description Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	de61849994	meta: Remove the 'allocate_storage' parameter in _mesa_meta_pbo_GetTexSubImage() Texture is already allocated before calling this meta function. So, the value of 'allocate_storage' passed to the function is always false. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	6d4ebbe9e5	meta: Fix the pbo usage in meta for GLES{1,2} contexts OpenGL ES 1.0 doesn't support using GL_STREAM_DRAW and both ES 1.0 and 2.0 don't support GL_STREAM_READ in glBufferData(). So, handle it correctly by calling the _mesa_meta_begin() before create_texture_for_pbo(). V2: Remove the changes related to allocate_storage. (Ian) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-02 15:06:45 -08:00
Matt Turner	0d047d10f1	program: Clean up after condition code removal.	2016-03-02 12:15:58 -08:00
Matt Turner	961ead6746	program: Remove variable used only in assert().	2016-03-02 12:15:58 -08:00
Matt Turner	de2ef0401b	program: Drop GL_FRAGMENT_PROGRAM_NV from switch statement.	2016-03-02 12:15:58 -08:00
Jordan Justen	98cdce1ce4	anv/gen7: Use predicated rendering for indirect compute For OpenGL, see commit `9a939ebb47`. Fixes: * dEQP-VK.compute.indirect_dispatch.upload_buffer.empty_command * dEQP-VK.compute.indirect_dispatch.gen_in_compute.empty_command Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-02 12:03:05 -08:00
Jordan Justen	da4745104c	anv: Save batch to local variable for indirect compute Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-02 12:03:05 -08:00
Jason Ekstrand	b0867ca4b2	anv: Fix make check	2016-03-02 11:45:29 -08:00
Samuel Pitoiset	b94a46aa8e	gk110/ir: fix wrong emission of NOT modifier for VOTE Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-02 20:36:18 +01:00
Jason Ekstrand	2168082a48	isl: Fix make check	2016-03-02 11:31:22 -08:00
Jason Ekstrand	8f5a64e44f	gen8/cmd_buffer: Properly return flushed push constant stages This is required on SKL so that we can properly re-emit binding table pointers commands.	2016-03-02 10:48:40 -08:00
Thomas Hindoe Paaboel Andersen	535002f4da	gallium/cso: fix indentation Only one of these were recently introduced. However, since we keep copy/pasting the same wrong indentation we should probably just fix it. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-02 08:55:20 -07:00
Thomas Hindoe Paaboel Andersen	37cfc51b13	st/mesa: move dereference after null check We should not dereference shader before we have done the null check. Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-02 08:55:20 -07:00
Matt Turner	ad17511302	i965/gen6/gs: Replace V-immediate with VF-immediate. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-02 07:28:52 -08:00
Marek Olšák	43f74ac67c	gallium: fix PIPE_BIND_QUERY_BUFFER - PIPE_BIND_SCANOUT overlap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-02 15:32:52 +01:00
Samuel Iglesias Gonsálvez	e8fd60e789	i965: set ctx->Const.MaxViewport{Width,Height} to 32k From ARB_viewport_array spec: " * On GL3-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-16384, 16383]. * On GL4-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-32768, 32767]." This range is set using ctx->Const.MaxViewportWidth value, so just bump those constants to 32k for gen7+ which can support OpenGL 4.0. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez	add57b3fa8	main: remove MAX_VIEWPORT_WIDTH and MAX_VIEWPORT_HEIGHT constants Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez	aa849d97a0	main: call invalidate_framebuffer_storage() with driver's viewport limits Don't use hardcoded ones because the driver can set different ones. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Jason Ekstrand	5b70aa11ee	anv/meta_blit: Use unorm formats for 8 and 16-bit RGB and RGBA values While Broadwell is very good about UINT formats, HSW is more restrictive. Neither R8G8B8_UINT nor R16G16B16_UINT really exist on HSW. It should be safe to just use the unorm formats.	2016-03-01 21:45:20 -08:00
Kenneth Graunke	89e421369c	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-01 17:11:29 -08:00
Rob Clark	c4ae047cab	freedreno/ir3: enable shareable shaders Now that we are no longer using the pctx reference in the shader, drop it and turn on shareable shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:21:45 -05:00
Rob Clark	c3f2f8cbe4	freedreno/ir3: pass ctx to constant-emit code Rather than fishing it out of the shader. This removes the other big user of shader->pctx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:20:44 -05:00
Rob Clark	5fd152bae8	freedreno/ir3: add dev ptr to ir3_compiler And use this for allocating bo's to hold the shader binary, rather than accessing the dev via ctx ptr. One step towards making shaders sharable across contexts. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:20:33 -05:00
Jason Ekstrand	e941fd8470	genxml: Make the border color pointer consistent across gens	2016-03-01 14:43:05 -08:00
Jason Ekstrand	eecd1f8001	gen7/pipeline: Add competent blending This is mostly a copy-and-paste from gen8. Blending still isn't 100% but it fixes about 1100 CTS blend tests on HSW.	2016-03-01 13:51:58 -08:00
Jason Ekstrand	8b091deb5e	anv: Unify gen7 and gen8 state Now that we've pulled surface state setup into ISL, there's not much to do here.	2016-03-01 12:17:23 -08:00
Matt Turner	1be953797e	mesa: Remove NV_fragment_program remnants from dlist.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:30 -08:00
Matt Turner	89abb22a85	mesa: Remove NV_fragment_program_option enable bit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:30 -08:00
Matt Turner	ed72a1c118	program: Remove NV_fragment_program opcode parsing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	5429554f09	program: Remove NV_fragment_program scalar suffix parsing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	409c24f9cc	program: Remove NV_fragment_program_option parsing support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	fe2d2c7ad8	program: Remove NV_fragment_program Abs support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	0d1f6c752f	program: Remove incorrect comment about OPCODE_TXD. The table in prog_instruction.h is correct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	624d06708d	program: Remove OPCODE_TXP_NV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	aaef6cf4e3	program: Clean up after previous commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	7b50b0457d	program: Remove condition-code and precision support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	9e11ff7e11	program: Remove OPCODE_KIL_NV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	a0c3650ad3	program: Remove RelAddr2 support. Looks like more never-used crap from the first geometry shader attempt. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	6b1fb4862e	program: Mark table const. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	fc61b41a95	mesa: Remove EmitCondCodes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	7fe206da28	docs: Remove descriptions of long dead Emit* fields. Dead since commit `d8a366200` in 2010. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	f3b68fc5fc	glsl: Initialize gl_shader_program::EmptyUniformLocations. Commit `65dfb30` added exec_list EmptyUniformLocations, but only initialized the list if ARB_explicit_uniform_location was enabled, leading to crashes if the extension was not available. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-03-01 11:41:29 -08:00
Ian Romanick	1a80ca22fe	i965/meta: Don't pollute the framebuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	8f1b1878a0	i965/meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	3071da3032	meta: Don't pollute the framebuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Fixes piglit tests: - object-namespace-pollution glGetTexImage-compressed framebuffer - object-namespace-pollution glGenerateMipmap framebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
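The collision scenario described in the commit message above can be modelled in a few lines. This is a minimal Python sketch, not Mesa code: `NameTable`, `gen`, `bind`, and `meta_operation` are hypothetical stand-ins for a GL object namespace in which un-generated names are legal.

```python
class NameTable:
    """Toy model of a GL object namespace that permits un-generated names."""

    def __init__(self):
        self.objects = {}  # name -> data

    def gen(self):
        # Return the lowest unused positive name (assumed Gen behaviour).
        name = 1
        while name in self.objects:
            name += 1
        self.objects[name] = None
        return name

    def bind(self, name):
        # Binding an unknown name creates the object on the spot.
        self.objects.setdefault(name, None)
        return name


def meta_operation(table, state):
    # Hypothetical meta helper: lazily generates a name in the application's
    # namespace, then overwrites whatever object carries that name.
    if state.get("fbo") is None:
        state["fbo"] = table.gen()
    table.objects[state["fbo"]] = "meta scratch data"


table = NameTable()
meta_state = {}

meta_operation(table, meta_state)             # meta grabs name 1
app_name = table.bind(1)                      # app uses name 1 without Gen
table.objects[app_name] = "application data"
meta_operation(table, meta_state)             # meta overwrites the same object

app_data_after = table.objects[app_name]      # the application's data is gone
```

In this model the fix corresponds to the meta code never calling `gen` on the shared table at all, so its objects cannot alias a name the application picks.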
Ian Romanick	91e5825b8a	meta/decompress: Track framebuffer using gl_framebuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	3ed44fab18	meta/generate_mipmap: Track framebuffer using gl_framebuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	ec5757f9c9	meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	7c254f0200	meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers This enables later patches that will stop calling _mesa_GenFramebuffers or _mesa_CreateFramebuffers which pollute the framebuffer namespace. For framebuffers, the Bind call is still necessary. sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \ src/mesa/drivers/common/*.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	6b70c9ea98	i965/meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers This enables later patches that will stop calling _mesa_GenFramebuffers or _mesa_CreateFramebuffers which pollute the framebuffer namespace. For framebuffers, the Bind call is still necessary. sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \ src/mesa/drivers/dri/i965/*.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	f76462cb6f	meta: Save and restore the framebuffer using gl_framebuffer instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. This also fixes another latent bug in meta. In a multithreaded, multicontext application, one thread can delete an object that is bound in another thread. That object continues to exist until it is unbound (i.e., its refcount drops to zero). Meta unbinds objects all over the place. As a result, the rebind in _mesa_meta_end could fail because the object vanished! See https://bugs.freedesktop.org/show_bug.cgi?id=92363#c8. Using _mesa_reference_<object type> to save and restore the objects prevents the refcount from going to zero. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	fed9b0ed5a	mesa: Refactor bind_framebuffer to make _mesa_bind_framebuffers Fixing dd_function_table::BindFramebuffer will come later because that change is probably not suitable for stable. v2: Fix whitespace issue noticed by Topi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	64aff35f84	meta: Use _mesa_check_framebuffer_status instead of _mesa_CheckFramebufferStatus sed -i -e 's/_mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \ -e 's/_mesa_CheckFramebufferStatus(GL_FRAMEBUFFER[^)]*/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \ -e 's/_mesa_CheckFramebufferStatus(GL_READ_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->ReadBuffer/' \ $(grep -rl _mesa_CheckFramebufferStatus src/mesa/drivers) The second expression catches both GL_FRAMEBUFFER and GL_FRAMEBUFFER_EXT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
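The claim that the second sed expression catches both `GL_FRAMEBUFFER` and `GL_FRAMEBUFFER_EXT` can be checked with an equivalent set of substitutions. The sketch below ports the three expressions to Python's `re` module (where the literal parenthesis has to be escaped); the `rewrite` helper and the sample input lines are illustrative assumptions, not taken from the Mesa tree.

```python
import re

# The same three substitutions as the sed command, applied in the same order.
RULES = [
    (r"_mesa_CheckFramebufferStatus\(GL_DRAW_FRAMEBUFFER",
     "_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer"),
    (r"_mesa_CheckFramebufferStatus\(GL_FRAMEBUFFER[^)]*",
     "_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer"),
    (r"_mesa_CheckFramebufferStatus\(GL_READ_FRAMEBUFFER",
     "_mesa_check_framebuffer_status(ctx, ctx->ReadBuffer"),
]


def rewrite(line):
    """Apply every rule in order to one line of source text."""
    for pattern, replacement in RULES:
        line = re.sub(pattern, replacement, line)
    return line


# Hypothetical call sites, one per enum spelling.
samples = [
    "status = _mesa_CheckFramebufferStatus(GL_FRAMEBUFFER);",
    "status = _mesa_CheckFramebufferStatus(GL_FRAMEBUFFER_EXT);",
    "status = _mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER);",
    "status = _mesa_CheckFramebufferStatus(GL_READ_FRAMEBUFFER);",
]
results = [rewrite(s) for s in samples]
```

The `[^)]*` in the second rule consumes any suffix up to the closing parenthesis, which is why `_EXT` disappears along with the enum name.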
Ian Romanick	92266ff7a3	meta: Obvious refactor of _mesa_meta_framebuffer_texture_image Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	f69c743069	meta: Convert _mesa_meta_bind_fbo_image to take a gl_framebuffer instead of a GL API handle Also change the name of the function to _mesa_meta_framebuffer_texture_image. The function is basically a wrapper around _mesa_framebuffer_texture (which is used to implement glFramebufferTexture1D and friends), so it makes sense for its name to be similar to that. The next patch will clean _mesa_meta_framebuffer_texture_image up considerably. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Jason Ekstrand	6e20c1e058	anv/cmd_buffer: Look at both sides for stencil enable Now it's all consistent with gen9	2016-03-01 11:03:29 -08:00
Jason Ekstrand	4cfdd16500	anv/cmd_buffer: Clean up stencil state setup on gen7	2016-03-01 11:02:21 -08:00
Jason Ekstrand	bb08d86efe	anv/cmd_buffer: Clean up stencil state setup on gen8	2016-03-01 10:58:43 -08:00
Kristian Høgsberg Kristensen	22d8666d74	anv: Add in image->offset when setting up depth buffer Fix from Neil Roberts. https://bugs.freedesktop.org/show_bug.cgi?id=94348	2016-03-01 09:19:39 -08:00
Jason Ekstrand	38f4c11c2f	anv/pipeline: Pull 3DSTATE_SBE into a shared helper	2016-03-01 08:46:32 -08:00
Dave Airlie	ac222626ad	virgl: add support for passing render condition flags to host. This just passes the extra blit info to fix the render condition tests. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-01 15:50:00 +10:00
Jason Ekstrand	3f8df795c1	genxml: Break output detail of 3DSTATE_SBE on gen7 into a struct This makes it work like 3DSTATE_SBE_SWIZ on gen8+ which is much more convenient.	2016-02-29 16:47:42 -08:00
Kenneth Graunke	24994ae926	i965: Push most TES inputs in vec4 mode. (This is commit `4a1c8a3037` for vec4 mode.) Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 24 vec4 slots (12 registers) should suffice. (I chose this instead of the 32 vec4 slots used in the scalar backend to avoid regressing a few Piglit tests due to the vec4 register allocator being too stupid to figure out what to do. We probably ought to fix that, but it's a separate issue.) Improves performance in GPUTest's tessmark_x64 microbenchmark by 41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU (with Iris Pro 5200). Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 38.3576% +/- 0.759748% (n = 42). v2: Simplify abs/negate handling, as requested by Matt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-29 16:12:50 -08:00
Marek Olšák	c54f38494c	r600g: remove support for DRM < 2.12.0	2016-03-01 00:18:54 +01:00
Marek Olšák	b7da8fa11d	r300g: remove support for DRM < 2.12.0	2016-03-01 00:18:54 +01:00
Marek Olšák	a5e2a173dd	winsys/radeon: drop support for DRM 2.12.0 (kernel < 3.2) in order to make some winsys interface changes easier These distros should use a new DRM if they want to use new Mesa (distro: kernel, Mesa, EOL): SLES 10: 2.6.16, 6.4.2, 2016-07; SLED 11: 3.0, 9.0.3, 2022-03; RHEL 5: 2.6.18, 6.5.1, 2017-03; RHEL 6: 2.6.32, 10.4.3, 2020-11; Debian 6: 2.6.32, 7.7.1, 2016-02. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	69a8e435ce	radeonsi: also dump shaders on a VM fault Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	18df72b50b	radeonsi: dump full shader disassemblies into ddebug logs including prolog and epilog disassemblies Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	74b4ce81fb	radeonsi: allow dumping shader disassemblies to a file Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	d0f3b524cd	radeonsi: use re-Z This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage). v2: add comments Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:19 +01:00
Marek Olšák	09bfbd43a0	tgsi/scan: count memory instructions for radeonsi Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-01 00:11:32 +01:00
Jason Ekstrand	097564bb8e	anv/cmd_buffer: Dirty push constants when changing pipelines.	2016-02-29 14:36:24 -08:00
Jason Ekstrand	d29fd1c7cb	anv/cmd_buffer: Re-emit push constants packets for all stages	2016-02-29 14:36:24 -08:00
Jason Ekstrand	9715724015	anv/pipeline: Follow push constant alignment restrictions on BDW+ and HSW gt3	2016-02-29 14:36:24 -08:00
Jason Ekstrand	6986ae35ad	anv/pipeline: Avoid a division by zero	2016-02-29 14:36:24 -08:00
Jason Ekstrand	51b618285d	anv/pipeline: Use dynamic checks for max push constants The GEN_GEN macros aren't available in anv_pipeline since it only gets compiled once for the whole driver.	2016-02-29 14:36:24 -08:00
Dave Airlie	35859d5bbb	mesa/fbobject: propagate Layered when reusing attachments. When reusing a depth attachment as a stencil, we need to propagate the layered bit, otherwise we fail to complete the framebuffer. Discovered running ./bin/fbo-depth-array depth-layered-clear on virgl on haswell. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-01 07:34:37 +10:00
Nanley Chery	74b7b59db5	isl/surface_state: Fix array spacing on Gen7 v2: Don't cast the enum to a boolean (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-29 11:43:33 -08:00
Kristian Høgsberg Kristensen	9d8bae6137	anv: Don't advertise pipelineStatisticsQuery We don't support that just yet. Reported-by: Jacek Konieczny <jajcus@jajcus.net>	2016-02-29 10:55:39 -08:00
Axel Davy	83bc2acfe9	st/nine: Fix second Multithreading issue with MANAGED buffers Here is another threading issue with MANAGED buffers: Thread 1: buffer creation Thread 1: buffer lock Thread 2: Draw call Thread 1: writes data Thread 1: Unlock Without this patch, the buffer is initially dirty and in the list of things to upload after its creation. The draw call will then upload the data and unset the dirty flag, and the Unlock won't trigger a second upload. Fixes regression introduced by `cc0114f30b`: "st/nine: Implement Managed vertex/index buffers" Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	44246fe99d	st/nine: Fix Multithreading issue with MANAGED buffers d3d calls are protected by mutexes, but an app may interleave calls from two threads as follows: Thread 1: buffer Lock; Thread 2: Draw call; Thread 1: writes data; Thread 1: Unlock. Before this patch, the Draw call would begin to upload the buffer. Solve this by adding the buffer to the queue of things to upload at Unlock time instead of Lock time. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
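The interleaving described in the commit above can be replayed deterministically. This is a minimal single-threaded Python sketch under stated assumptions: `run`, the upload queue, and the `"old"`/`"new"` payloads are hypothetical, and the only thing that varies is whether the buffer is queued for upload at Lock time or at Unlock time.

```python
def run(enqueue_at):
    """Replay Lock -> Draw -> write -> Unlock -> Draw; return the GPU copy."""
    cpu_copy = "old"
    gpu_copy = "old"
    upload_queue = []

    def lock():
        if enqueue_at == "lock":
            upload_queue.append("buf")

    def unlock():
        if enqueue_at == "unlock":
            upload_queue.append("buf")

    def draw():
        nonlocal gpu_copy
        # A draw call flushes every pending upload before rendering.
        while upload_queue:
            upload_queue.pop()
            gpu_copy = cpu_copy

    lock()            # thread 1: buffer Lock
    draw()            # thread 2: Draw call
    cpu_copy = "new"  # thread 1: writes data
    unlock()          # thread 1: Unlock
    draw()            # a later Draw call
    return gpu_copy


stale = run("lock")     # upload happened before the write landed
fresh = run("unlock")   # upload deferred until the data is complete
```

Queuing at Lock time lets the first draw consume the pending upload before the data is written, so the later draw sees stale contents; queuing at Unlock time avoids that.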
Axel Davy	35c858c42c	st/nine: Handle READONLY for buffer MANAGED pool READONLY won't trigger an upload. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	8a8affdfda	st/nine: Use Position input helper for ps3 declared inputs When the semantic is Position (which can happen with index 0 only), use the helper to get Position input. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	f08c990af5	st/nine: Introduce helper for Position shader input Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Marc-André Lureau	f1d12e7392	virtio_gpu: Add virtio 1.0 PCI ID to driver map Add the virtio-gpu PCI ID for virtio 1.0 (according to the specification, "the PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID") Support for virtio 1.0 was added in qemu 2.4 (same time virtio-gpu landed). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 11:31:36 +00:00
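The ID rule quoted in the commit above is a single addition. The Python sketch below applies it; the constant names and the helper are illustrative, and the virtio device ID of 16 for the GPU device is an assumption stated here rather than something the commit message spells out.

```python
VIRTIO_PCI_MODERN_BASE = 0x1040  # offset quoted from the virtio 1.0 rule
VIRTIO_ID_GPU = 16               # assumption: virtio device ID of the GPU device


def virtio_modern_pci_device_id(virtio_device_id):
    """PCI Device ID = 0x1040 + Virtio Device ID."""
    return VIRTIO_PCI_MODERN_BASE + virtio_device_id


gpu_pci_id = virtio_modern_pci_device_id(VIRTIO_ID_GPU)
print(hex(gpu_pci_id))
```

Under that assumption the result is `0x1050`, which would be the value added to the driver map.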
Koop Mast	04bc09fdf9	st/clover: Add libelf cflags to the build Otherwise the build will fail, when the library is in a non default location. v2 [Emil Velikov] - drop the unneeded cflags from targets/opencl. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Fixes: `7f585a6a98` "configure.ac: use pkg-config for libelf" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93524 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 11:30:15 +00:00
Emil Velikov	c212a70cd9	mesa: add get-extra-pick-list.sh script into bin/ This is a very rudimentary script that checks if any of the applied cherry-picks have been referenced (fixed?) by another patch, where the latter is either missing the stable tag or hasn't yet been picked. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-02-29 11:25:35 +00:00
Emil Velikov	64500f21f3	automake: explicitly set distcheck configure flags Pretty much all of these are enabled by default. Considering the recent updates (see previous commits) one might as well list most/all of these here. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Emil Velikov	325bc6fb4a	automake: add more missing options for make distcheck Namely - opencl, osmesa (only the gallium flavour as it conflicts with the classic one), surfaceless egl platform and a couple gallium drivers (virgl and vc4). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Emil Velikov	0b6157e971	install-gallium-links: port changes from install-lib-links Namely: `b662d5282f` mesa: Add clean-local rule to remove .lib links. `5c1aac17ad` install-lib-links: don't depend on .libs directory `fece147be5` install-lib-links: remove the .install-lib-links file With these in place, make distcheck now passes and a race condition has been avoided. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Rob Herring	51b22bd468	r600: Make enum alu_op_flags unsigned In builds with clang, there are several errors related to the enum alu_op_flags like this: src/gallium/drivers/r600/sb/sb_expr.cpp:887:8: error: case value evaluates to -1610612736, which cannot be narrowed to type 'unsigned int' [-Wc++11-narrowing] These are due to the MSB being set in the enum. Fix these errors by making the enum values unsigned as needed. The flags field that stores this enum also needs to be unsigned. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-29 10:51:45 +00:00
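The negative number in the clang diagnostic above is what a 32-bit pattern with the most significant bit set looks like when read as a signed integer. The Python sketch below reproduces the arithmetic; the helper names are illustrative, and the bit pattern `0xA0000000` is derived from the error text, not from the r600 sources.

```python
import struct


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit pattern as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


def as_unsigned_32(value):
    """Recover the unsigned 32-bit pattern behind a signed value."""
    return value & 0xFFFFFFFF


reported = -1610612736                 # value from the clang error message
pattern = as_unsigned_32(reported)     # the underlying flag bits
round_trip = as_signed_32(pattern)
print(hex(pattern))
```

The recovered pattern is `0xa0000000`, which has bit 31 set. An enum with a signed underlying type therefore yields a negative case value, and making the enum values and the `flags` field unsigned removes the narrowing.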
Rob Herring	92dd38df5a	gallium/radeon: Add space between string literal and identifier Fix compiles with clang that have this C++11 error: src/gallium/drivers/radeon/r600_pipe_common.h:662:34: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal] Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-29 10:51:45 +00:00
Rob Herring	0156a33aa3	freedreno: drop unnecessary -Wno-packed-bitfield-compat Enabling this warning doesn't generate any warnings with gcc, but is an unknown option for clang, so drop it. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Rob Clark <robdclark@gmail.com> (v1) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> v2: keep the warning around, commented out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Rob Herring	8949edf018	Android: clean-up and fix DRI module path handling MESA_DRI_MODULE_PATH is only getting set for classic DRI drivers and may or may not be set correctly for gallium_dri.so depending on the makefile include ordering. For Android 6 and earlier it is fine, but with build system changes in AOSP master, it is not. Move the path variables to a single place at the top level and introduce MESA_DRI_MODULE_REL_PATH for Android 5 and later which require relative paths. With this, there is a single variable to change. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	0663edf85b	Android: remove headers from LOCAL_SRC_FILES The Android build system now spits out warnings for header files listed in LOCAL_SRC_FILES, so strip them out. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	6dae9176d6	Android: add -Wno-date-time flag for clang clang complains about date/time macros: src/mesa/main/context.c:403:25: error: expansion of date or time macro is not reproducible [-Werror,-Wdate-time] Disable this warning. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	a2f16db19b	Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system The makefile was implicitly picking up YACC_HEADER_SUFFIX from the Android build system, but this variable is now gone. Add it locally to fix the build with AOSP master. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	794221fbb7	Android: remove dependence on .SECONDEXPANSION With the Android build system changes to ninja/kati, the use of .SECONDEXPANSION is no longer supported. Fix this by avoiding rule specific variables and using $(transform-generated-source). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	574a92b048	Android: fix build break from nir/glsl move to compiler/ Commits `a39a8fbbaa` ("nir: move to compiler/") and `eb63640c1d` ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Cc: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Oded Gabbay	a640ad15e1	gallium/radeon: disable evergreen_do_fast_color_clear for BE This function is currently broken for BE. I assume it's because of util_pack_color(). Until I fix this path, I prefer to disable it so users would be able to see correct colors on their desktop and applications. Together with the two following patches: - gallium/r600: Don't let h/w do endian swap for colorformat - gallium/radeon: remove separate BE path in r600_translate_colorswap it fixes BZ#72877 and BZ#92039 Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	e3dfc0e095	gallium/r600: Don't let h/w do endian swap for colorformat Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianness into account (see the big #if in p_format.h). v2: return ENDIAN_NONE only for four 8-bits components (V_0280A0_COLOR_8_8_8_8) Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	9559071ed6	gallium/radeon: remove separate BE path in r600_translate_colorswap After further testing, it appears there is no need for separate BE path in r600_translate_colorswap() The only fix remaining is the change of the last if statement, in the 4 channels case. Originally, it contained an invalid swizzle configuration that never got hit, in LE or BE. So the fix is relevant for both systems. This patch adds an additional 120 available visuals for LE and BE, as seen in glxinfo v2: Tested for regressions by running piglit gpu.py with CAICOS (r600g) on x86-64 machine. No regressions found. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Samuel Pitoiset	07ed003faf	nv50/ir: emit VOTE instruction Changes from v2: - add missing NOT modifier for GK110/GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 23:58:11 +01:00
Jordan Justen	635c0e92b7	anv: Set CURBEAllocationSize in MEDIA_VFE_STATE Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 11:54:49 -08:00
Jordan Justen	1af5dacd76	anv/gen7: Enable SLM in L3 cache control register Port `1983003` to gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 11:54:49 -08:00
Kristian Høgsberg Kristensen	b00b42d99b	nir/spirv: Use the new bare sampler type	2016-02-28 11:24:05 -08:00
Jordan Justen	72efb68d48	anv/pipeline: Set URB offset to zero if size is zero After `3ecd357d81`, it may be possible for the VS to get assigned all of the URB space. On Ivy Bridge, this will cause the offset for the other stages to be 16, which cannot be packed into the ConstantBufferOffset field of 3DSTATE_PUSH_CONSTANT_ALLOC_*. Instead we can set the offset to zero if the stage size is zero. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:51:38 -08:00
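The packing problem in the commit above can be illustrated with a toy allocator. This is a minimal Python sketch under loud assumptions: the 4-bit width of the offset field (maximum value 15), the 16-unit total, and the `allocate` helper are all hypothetical and chosen only to match the "offset 16 cannot be packed" description.

```python
FIELD_MAX = 15  # assumption: a 4-bit ConstantBufferOffset field


def allocate(sizes, zero_offset_for_empty):
    """Assign consecutive offsets to stages; optionally zero them for empty stages."""
    offsets = []
    cursor = 0
    for size in sizes:
        if size == 0 and zero_offset_for_empty:
            offsets.append(0)       # behaviour after the fix
        else:
            offsets.append(cursor)  # running offset, may exceed the field
        cursor += size
    return offsets


# Hypothetical layout: the VS takes all 16 units, the other stages take none.
sizes = [16, 0, 0, 0, 0]
before = allocate(sizes, zero_offset_for_empty=False)
after = allocate(sizes, zero_offset_for_empty=True)

unpackable_before = [o for o in before if o > FIELD_MAX]
unpackable_after = [o for o in after if o > FIELD_MAX]
```

Since a stage with size zero never reads its allocation, its offset is arbitrary, and zero always fits the field.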
Jordan Justen	ef06ddb08a	anv/pipeline: Set FS URB space to zero if the FS is unused Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:51:38 -08:00
Jordan Justen	45d8ce07a5	anv/pipeline: Set stage URB size to zero if it is unused Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:49:39 -08:00
Samuel Pitoiset	b3efa0a59e	gk110/ir: add ld lock/st unlock emission Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 19:20:20 +01:00
Ilia Mirkin	aa3b85fd18	nv50,nvc0: bump minimum texture buffer offset alignment It appears that it actually needs to be aligned to the datum size, so it was 1 when testing with R8, but it can be as high as 16 with RGBA32. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-02-27 16:26:34 -05:00
Jason Ekstrand	46b7c242da	anv/gen7: Clean up the dummy PS case Fix whitespace and remove dead comments	2016-02-27 11:24:09 -08:00
Jason Ekstrand	e18a2f037a	anv/gen7: Set MaximumNumberofThreads in the dummy PS packet	2016-02-27 11:23:56 -08:00
Jason Ekstrand	ad50896c87	anv/gen7: Only try to get the depth format if the surface has depth	2016-02-27 11:23:18 -08:00
Jason Ekstrand	4b34f2ccb8	anv/image: Use isl for filling brw_image_param	2016-02-27 10:26:14 -08:00
Jason Ekstrand	bd6470fa6c	isl: Add helpers for filling out brw_image_param	2016-02-27 10:26:14 -08:00
Jason Ekstrand	7363024cbd	anv: Fill out image_param structs at view creation time	2016-02-27 10:26:14 -08:00
Jason Ekstrand	e9d126f23b	anv/image: Add a usage_mask field to image_view_init This allows us to avoid doing some unneeded work on the meta paths where we know that the image view will be used for exactly one thing. The meta paths also sometimes do things that aren't quite valid like setting the array slice on a 3-D texture and we want to limit the number of paths that need to be able to sensibly handle the lies.	2016-02-27 10:26:14 -08:00
Jason Ekstrand	b4c16fd01a	isl: Move isl_image.c to isl_storage_image.c	2016-02-27 10:26:14 -08:00
Jason Ekstrand	eb19d640eb	anv: Use isl to fill buffer surface states	2016-02-27 10:26:14 -08:00
Jason Ekstrand	a0cd20eb7f	isl: Add a helper for filling a buffer surface state	2016-02-27 10:26:14 -08:00
Jason Ekstrand	9d5b8f7709	anv: Remove unneeded fields from anv_image_view	2016-02-27 10:26:14 -08:00
Jason Ekstrand	b70a8d40fa	anv/state: Remove unused fill_surface_state functions	2016-02-27 10:26:14 -08:00
Jason Ekstrand	ded57c3cca	anv: Use ISL to fill out surface states	2016-02-27 10:26:14 -08:00
Jason Ekstrand	4a9b805ce5	anv/device: Store the default MOCS in the device	2016-02-27 10:26:13 -08:00
Jason Ekstrand	d798762cdb	isl: Add a function for filling out a surface state	2016-02-27 10:26:13 -08:00
Jason Ekstrand	6b06072ba8	isl: Create per-gen helper libraries for gens 7, 8, and 9	2016-02-27 10:26:13 -08:00
Jason Ekstrand	82d2db80bb	genxml: Add MOCS fields to RENDER_SURFACE_STATE This allows us to set MOCS as a single uint32_t on all platforms.	2016-02-27 10:26:13 -08:00
Jason Ekstrand	452782f68b	gen/genX_pack: Add genxml to the pack header path If you have an out-of-tree build, gen8_pack.h and friends will not be in the same folder as genX_pack.h so this will be a problem. We fixed out-of-tree earlier by adding the genxml folder to the includes for the vulkan driver. However, this is not a good long-term solution because we want to use it in ISL as well.	2016-02-27 10:26:13 -08:00
Ilia Mirkin	e2dce1a340	mesa: add GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 support The two extensions are identical, and are largely taking bits of already existing desktop functionality. We continue to do a poor job of supporting the 'precise' keyword, just like we do on desktop. This passes the relevant dEQP tests that I could find. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-27 00:08:28 -05:00
Ilia Mirkin	2875183463	mesa: expose GL_EXT_texture_sRGB_decode on GLES 3.0+ Could be exposed on earlier GLES versions if we supported EXT_sRGB, but we don't, for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-26 23:55:45 -05:00
Nanley Chery	265d4c415c	isl: Fix isl_surf_get_image_intratile_offset_el() Consecutive tiles are separated by the size of the tile, not by the logical tile width. v2: Remove extra subtraction (Ville) Add parenthesis (Jason) v3: Update the unit tests for the function Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-26 16:59:36 -08:00
Ian Romanick	585b18f305	i965/cfg: Fix comment list punctuation Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-26 16:51:27 -08:00
Ian Romanick	5bfb302783	i965/cfg: Split out dead control flow paths to simplify both paths v2: Fix some bad indentation. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	2513a20240	i965/cfg: Don't handle fully empty if/else/endif This will now never occur. The empty if-else part would have already been removed leaving an empty if-endif part. No shader-db changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	69bb063ec2	i965/cfg: Eliminate an empty then-branch of an if/else/endif On BDW, total instructions in shared programs: 8448571 -> 8448367 (-0.00%) instructions in affected programs: 21000 -> 20796 (-0.97%) helped: 116 HURT: 0 v2: Remove spurious attempt to combine the if_block with the (removed!) else_block. Suggested by Matt and Curro. Correct the comment describing what the new pass does. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	c7deee69ea	i965/cfg: Track prev_block and prev_inst explicitly in the whole function This provides a trivial simplification now, and it makes some future changes more straightforward. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	70cf0eb5c7	i965/cfg: Slightly rearrange dead_control_flow_eliminate 'git diff -w' is a bit more illustrative. A couple declarations were moved, the continue was removed, and the code was reindented. This will simplify future changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Thomas Hindoe Paaboel Andersen	6bb6b5c341	anv: remove stray ; after if Both logic and indentation suggest that the ; was not intended here. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-26 16:05:28 -08:00
Jason Ekstrand	b7bc52b5b1	anv/gen8: Emit the 3DSTATE_PS_BLEND packet	2016-02-26 16:04:48 -08:00
Kenneth Graunke	a0294c2cf3	i965: Simplify brw_nir_lower_vue_inputs() slightly. The same code appeared in both branches; pull it above the if statement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	8151003ade	i965: Avoid recalculating the normal VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	15b3639bf1	i965: Avoid recalculating the tessellation VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	cfbd9831f8	i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	b96ddd2e52	i965: Split brw_nir_lower_inputs/outputs into per-stage functions. These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	d33c478bed	i965: Remove catch-all nir_lower_io call with specific cases. Most cases already call nir_lower_io explicitly for input and output lowering. This catch all isn't very useful anymore - we can just add it to the remaining cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	51f8797993	i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir. This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir(). The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	dcd4a841e9	i965: Always do NIR IO lowering at specialization time. We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it. The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	fa7135107f	i965: Make an is_scalar boolean in brw_compile_gs(). Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Jason Ekstrand	b3cb6e78aa	i965/nir: Do lower_io late for fragment shaders The Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time. (Rationale added to commit message by Ken) Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Jordan Justen	7428e6f86a	i965: Set dest type to UW for several send messages Without this, on SIMD 16 the send instruction destination will appear to write more than one destination register, causing the simulator to report an error. Of course, the send instruction can actually write more than one destination register regardless of the type set for the destination, so this is a bit strange. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 12:03:56 -08:00
Samuel Pitoiset	aad48f8691	nvc0: rework nvc0_compute_validate_program() Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:27 +01:00
Samuel Pitoiset	e1f5c76047	nvc0: make sure to validate compute global buffers on Fermi There is no reason not to validate those global buffers, and this might avoid failures if someone tries to use the global memory from compute programs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:23 +01:00
Samuel Pitoiset	dcf7938833	nvc0: move nvc0_validate_global_residents() to nvc0_compute.c While we are at it, rename it to nvc0_compute_validate_globals() and update its prototype. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:18 +01:00
Derek Foreman	d085a5dff5	egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage Since commit `d1314de293` we ignore damage passed to SwapBuffersWithDamage. Wayland 1.10 now has functionality that allows us to properly process those damage rectangles, and a way to query if it's available. Now we can use wl_surface.damage_buffer and interpret the incoming damage as being in buffer co-ordinates. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Signed-off-by: Derek Foreman <derekf@osg.samsung.com>	2016-02-26 11:49:09 +00:00
Dave Airlie	840aa52f50	virgl: add missing CAP turned off.	2016-02-26 04:03:09 +00:00
Miklós Máté	847f1cc698	program: Remove extra reference_program() It was already done in get_mesa_program() Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-25 22:02:50 +01:00
Emil Velikov	51c65a4c48	automake: add nine to make distcheck Will allow us to catch/prevent issues, like the one in mesa 11.2.0-rc1. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-25 19:56:07 +00:00
Emil Velikov	b08dbc84fe	st/nine: don't forget to bundle the nine_limits.h file Without this mesa 11.2.0-rc1 ended up busted :-( Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reported-by: Ondřej Súkup <mimi.vx@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-25 19:56:07 +00:00
Matt Turner	4009a9ead4	i965/fs: Allow saturate propagation to propagate negations into MADs. Allows us to transform `mad res src0 src1 src2; mov.sat dst -res` into `mad.sat dst -src0 -src1 src2`. instructions in affected programs: 3712 -> 3688 (-0.65%); helped: 24. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:15 -08:00
Matt Turner	65d3217cb0	i965/fs: Allow saturate propagation to propagate negations into ADDs. Allows us to transform `add res src0 src1; mov.sat dst -res` into `add.sat dst -src0 -src1`. No shader-db changes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:13 -08:00
Matt Turner	7b6113bc2d	i965/fs: Allow saturate propagation to propagate negations into MULs. Allows us to transform `mul res src0 src1; mov.sat dst -res` into `mul.sat dst src0 -src1`. instructions in affected programs: 45246 -> 45054 (-0.42%); helped: 162. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:10 -08:00
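The saturate-propagation rewrites in the rows above rely on negation commuting with the arithmetic before the clamp is applied. The following is a minimal sketch of that identity for the MUL and ADD cases, not the i965 pass itself: `saturate()` and the four helper functions are hypothetical names invented for illustration, and `.sat` is assumed to clamp to [0, 1].

```c
#include <assert.h>

/* Assumed model of the .sat modifier: clamp the result to [0, 1]. */
static float saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Before: mul res a b; mov.sat dst -res */
static float mul_then_negated_sat_mov(float a, float b)
{
   float res = a * b;
   return saturate(-res);
}

/* After: mul.sat dst a -b */
static float mul_sat_with_negated_source(float a, float b)
{
   return saturate(a * -b);
}

/* Before: add res a b; mov.sat dst -res */
static float add_then_negated_sat_mov(float a, float b)
{
   float res = a + b;
   return saturate(-res);
}

/* After: add.sat dst -a -b */
static float add_sat_with_negated_sources(float a, float b)
{
   return saturate(-a + -b);
}

/* Spot checks with exactly representable inputs. */
static void check_saturate_propagation(void)
{
   assert(mul_then_negated_sat_mov(0.5f, -1.5f) ==
          mul_sat_with_negated_source(0.5f, -1.5f));
   assert(mul_then_negated_sat_mov(2.0f, 2.0f) ==
          mul_sat_with_negated_source(2.0f, 2.0f));
   assert(add_then_negated_sat_mov(-0.25f, -0.5f) ==
          add_sat_with_negated_sources(-0.25f, -0.5f));
}
```

The negation is applied before the clamp in both forms, which is why folding it into the source operands leaves the saturated result unchanged.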
Matt Turner	1567da1e28	i965/fs: Don't CSE negated multiplies with saturation. It's not correct to CSE these multiplies: `mul.sat dst1, -a, b; mul.sat dst2, a, b` by emitting a negated MOV from dst1 to dst2: `mul.sat dst1, -a, b; mov dst2, -dst1`. Take 2.0*2.0 for example. The first multiply would produce 0.0 and the second would produce 1.0. Fixes bad generated code in 18 to 22 shaders: instructions in affected programs: 432 -> 464 (7.41%); helped: 4; HURT: 18. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:04 -08:00
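The 2.0*2.0 counterexample from the commit message above can be reproduced numerically. This is a sketch under the assumption that `.sat` clamps to [0, 1]; `saturate()`, `mul_sat()` and `dst2_via_negated_mov()` are hypothetical helper names, not i965 functions.

```c
#include <assert.h>

/* Assumed model of the .sat modifier: clamp the result to [0, 1]. */
static float saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* mul.sat dst, a, b */
static float mul_sat(float a, float b)
{
   return saturate(a * b);
}

/* The rejected rewrite: compute dst1 = mul.sat(-a, b), then mov dst2, -dst1. */
static float dst2_via_negated_mov(float a, float b)
{
   float dst1 = mul_sat(-a, b);
   return -dst1;
}

static void check_cse_counterexample(void)
{
   /* First multiply: -2.0 * 2.0 = -4.0, clamped to 0.0. */
   assert(mul_sat(-2.0f, 2.0f) == 0.0f);
   /* Second multiply: 2.0 * 2.0 = 4.0, clamped to 1.0. */
   assert(mul_sat(2.0f, 2.0f) == 1.0f);
   /* Negating the first result gives -0.0, which compares equal to 0.0,
    * not the 1.0 that the second multiply must produce. */
   assert(dst2_via_negated_mov(2.0f, 2.0f) == 0.0f);
}
```

Here the negation would be applied after the clamp, so the information discarded by saturation cannot be recovered.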
Matt Turner	3da789f1e9	glsl: Consider ubo_load to be a horizontal operation. Unclear to me whether it actually is a horizontal operation that cannot be vectorized, but the fact that i965 generates the same code in either case makes me less interested in finding out. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94199 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-25 10:50:34 -08:00
Jason Ekstrand	c32273d246	anv/device: Properly handle apiVersion == 0 From the Vulkan 1.0 spec section 3.2: "If apiVersion is 0 the implementation must ignore it"	2016-02-25 08:52:37 -08:00
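The rule quoted from the spec in the row above amounts to a special case in front of the version check. The sketch below is an illustration only: `MAKE_VERSION` mirrors the bit layout of Vulkan 1.0's `VK_MAKE_VERSION`, while `api_version_acceptable()` and the accepted range (any 1.0.x) are assumptions, not the anv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same packing as Vulkan 1.0's VK_MAKE_VERSION: major.minor.patch. */
#define MAKE_VERSION(major, minor, patch) \
   (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* Hypothetical check: 0 means "ignore"; otherwise accept any 1.0.x. */
static bool api_version_acceptable(uint32_t api_version)
{
   if (api_version == 0)
      return true; /* the implementation must ignore the field */

   return api_version >= MAKE_VERSION(1, 0, 0) &&
          api_version <= MAKE_VERSION(1, 0, 0xfff);
}

static void check_api_version(void)
{
   assert(api_version_acceptable(0));
   assert(api_version_acceptable(MAKE_VERSION(1, 0, 3)));
   assert(!api_version_acceptable(MAKE_VERSION(2, 0, 0)));
}
```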
Andres Gomez	d1509a5848	glsl/ast: Implicit conversion from double to float is not allowed Also, renamed get_conversion_operation to avoid future misunderstandings. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-25 13:10:50 +01:00
Oded Gabbay	439b5b008f	gallium/radeon: return correct values for BE in r600_translate_colorswap Because I changed the swizzle check, I also need to adapt the return values for each check. It's basically almost the same as before, we just cross between STD and STD_REV, and cross between ALT and ALT_REV This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also fixes tri-flat (mesa demos). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-25 09:21:08 +02:00
Oded Gabbay	ff8b41b702	gallium: remove duplicate define from enum pipe_format Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-25 09:21:08 +02:00
Ian Romanick	9d9aeb91b1	glsl: Detect do-while-false loops and unroll them Previously loops like `do { // ... } while (false);` that did not have any other loop-branch instructions would not be unrolled. This is commonly used to wrap multiline preprocessor macros. This produces IR like `(loop ( ... break ))`. Since limiting_terminator was NULL, the loop unroller would throw up its hands and say, "I don't know how many iterations. How can I unroll this?" We can detect this another way. If there is no limiting_terminator and the only loop-branch is a break as the last IR, there's only one iteration. On my very old checkout of shader-db, this removes a loop from Orbital Explorer, but it does not otherwise affect the shader. The loop removed is the one the compiler inserts surrounding the switch statement. This change does prevent some seriously bad code generation in some patches to meta shaders that I recently sent out for review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-24 18:43:40 -08:00
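For readers unfamiliar with the idiom mentioned in the row above, this is a small sketch of a multi-statement macro wrapped in a do-while-false loop. It is written in C (where the condition is spelled `0`) rather than GLSL, and `SWAP_INT` and `order_pair()` are hypothetical names. The loop body runs exactly once, which is the property the unroller change detects.

```c
#include <assert.h>

/* Multi-statement macro wrapped so it behaves like a single statement. */
#define SWAP_INT(a, b) \
   do {                \
      int tmp_ = (a);  \
      (a) = (b);       \
      (b) = tmp_;      \
   } while (0)

/* Returns 1 if the pair had to be swapped, 0 if it was already ordered.
 * Without the do/while wrapper, the semicolon after the macro call would
 * detach the else branch from the if. */
static int order_pair(int *lo, int *hi)
{
   if (*lo > *hi)
      SWAP_INT(*lo, *hi);
   else
      return 0;
   return 1;
}

static void check_order_pair(void)
{
   int a = 5, b = 3;
   assert(order_pair(&a, &b) == 1);
   assert(a == 3 && b == 5);
   assert(order_pair(&a, &b) == 0);
}
```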
Nanley Chery	3eb476fa14	i965: Enable tiled mem_copy with sRGB-formatted resources RGBA8 and BGRA8 unorm formats are compatible with the various mem_copy functions. Their sRGB counterparts are also compatible because they're also color-renderable (of importance when the specified resource is a readbuffer) and they share the same physical layout. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-24 14:40:34 -08:00
Kristian Høgsberg Kristensen	59f5728995	Merge remote-tracking branch 'origin/master' into vulkan	2016-02-24 13:04:54 -08:00
Kristian Høgsberg Kristensen	25c2470b24	anv: Set max_hs_threads/max_ds_threads	2016-02-24 12:21:26 -08:00
Kenneth Graunke	3ecd357d81	anv: Allocate more push constant space. Previously we allocated 4kB of push constant space for VS, GS, and PS (for a total of 12kB) no matter what. This works, but doesn't fully utilize the space - we have 16kB or 32kB of space. This makes anv use the same method as brw - divide up the space evenly among all active shader stages. This means HS and DS would get space, if those shader stages existed. In the future, we can probably do better by inspecting how many push constants each shader stage uses, and weight things accordingly. But this is strictly better than the old code, and ideally we'd justify a fancier solution with actual performance data.	2016-02-24 11:22:05 -08:00
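The even split described in the row above can be sketched as follows. This is an illustration of the arithmetic only: the `stage` enum, `split_push_constant_space()` and the choice to leave any remainder unassigned are assumptions, and the real driver may round or distribute leftovers differently.

```c
#include <assert.h>

/* Hypothetical stage indices for illustration. */
enum stage { STAGE_VS, STAGE_HS, STAGE_DS, STAGE_GS, STAGE_PS, STAGE_COUNT };

/* Give every active stage an equal share of total_kb; inactive stages get 0.
 * Any remainder from the integer division is simply left unused here. */
static void split_push_constant_space(unsigned total_kb, unsigned active_mask,
                                      unsigned size_kb[STAGE_COUNT])
{
   unsigned active = 0;
   for (int s = 0; s < STAGE_COUNT; s++) {
      if (active_mask & (1u << s))
         active++;
   }

   for (int s = 0; s < STAGE_COUNT; s++) {
      int is_active = (active_mask & (1u << s)) != 0;
      size_kb[s] = is_active ? total_kb / active : 0;
   }
}

static void check_split(void)
{
   unsigned size_kb[STAGE_COUNT];

   /* 16kB shared by VS, GS and PS: 5kB each, HS and DS get nothing. */
   split_push_constant_space(16, (1u << STAGE_VS) | (1u << STAGE_GS) |
                                 (1u << STAGE_PS), size_kb);
   assert(size_kb[STAGE_VS] == 5 && size_kb[STAGE_GS] == 5);
   assert(size_kb[STAGE_PS] == 5);
   assert(size_kb[STAGE_HS] == 0 && size_kb[STAGE_DS] == 0);
}
```

Compared with a fixed 4kB per stage, a pipeline with only VS and PS active would get 8kB per stage out of 16kB under this scheme.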
Kenneth Graunke	3f11517730	anv: Properly size the push constant L3 area. We were assuming it was 32kB everywhere, reducing the available URB space. It's actually 16kB on Ivybridge, Baytrail, and Haswell GT1-2.	2016-02-24 11:13:08 -08:00
Kenneth Graunke	7f9b03cc8b	anv: Emit 3DSTATE_PUSH_CONSTANT_ALLOC_* via a loop. Now we're emitting HS and DS packets as well.	2016-02-24 11:13:08 -08:00
Kenneth Graunke	1024a66fc4	anv: Emit 3DSTATE_URB_* via a loop. Rather than keeping separate {vs,hs,ds,gs}_start fields, we now store an array indexed by the shader stage (MESA_SHADER_*). The 3DSTATE_URB_* commands are also sequentially numbered. This makes it easy to just emit them in a loop. This simplifies the code a little, and also will make it easier to add more credible HS and DS code later.	2016-02-24 11:13:02 -08:00
Brian Paul	c95d5c5f6f	mesa: replace for loop with bitshifting in supported_buffer_bitmask() Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:32:01 -07:00
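The commit message in the row above does not show the code, so the following is only a sketch of the general technique of replacing a bit-setting loop with shifts. `range_mask_loop()` and `range_mask_shift()` are hypothetical names, and it is an assumption that the loop in `supported_buffer_bitmask()` set a run of consecutive bits.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form: set `count` consecutive bits starting at `first_bit`. */
static uint32_t range_mask_loop(unsigned first_bit, unsigned count)
{
   uint32_t mask = 0;
   for (unsigned i = 0; i < count; i++)
      mask |= 1u << (first_bit + i);
   return mask;
}

/* Shift form: build `count` low bits, then move them into place. */
static uint32_t range_mask_shift(unsigned first_bit, unsigned count)
{
   /* Shifting a 32-bit value by 32 or more is undefined in C. */
   assert(first_bit < 32 && count < 32 && first_bit + count <= 32);
   return ((1u << count) - 1u) << first_bit;
}

static void check_range_mask(void)
{
   assert(range_mask_shift(4, 8) == 0xff0u);
   assert(range_mask_shift(4, 8) == range_mask_loop(4, 8));
   assert(range_mask_shift(0, 0) == 0u);
}
```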
Brian Paul	ac37d0475c	mesa: updates some comments in buffers.c Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:53 -07:00
Brian Paul	d8412029bb	mesa: make _mesa_draw_buffers() static Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:44 -07:00
Brian Paul	24d8080507	mesa: make _mesa_draw_buffer() static Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:41 -07:00
Brian Paul	ebfcf9de43	mesa: make _mesa_read_buffer() static Not called from any other file. Remove _mesa_ prefix and update comments. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:37 -07:00
Brian Paul	1e41c2e135	mesa: move declaration of buffer var in handle_first_current() Declare the var in the scopes where it's used. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:31 -07:00
Brian Paul	c8fdb42c91	mesa: use gl_buffer_index in a few places Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:28 -07:00
Brian Paul	363019e17a	st/mesa: remove useless break statement Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:23 -07:00
Brian Paul	953cb24e65	st/mesa: rename st_readpixels to st_ReadPixels To match the convention of other device driver functions. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:17 -07:00
Brian Paul	83b589301f	st/mesa: fix frontbuffer glReadPixels regressions The change "mesa/readpix: Don't clip in _mesa_readpixels()" caused a few piglit regressions. The failing tests use glReadPixels to read from the front color buffer. The problem is we were trying to read from a nonexistent front color buffer. The front color buffer is created on demand in st/mesa. Since the missing buffer bounds were effectively 0 x 0 the glReadPixels was totally clipped and returned early. The fix involves creating the real front color buffer when we're about to try reading from it. Tested with llvmpipe and VMware driver on Linux, Windows. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94253 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94254 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94257 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:30:07 -07:00
Jason Ekstrand	c9564fd598	nir/spirv: Allow but warn for a few capabilities Unfortunately, glslang gives us cull/clip distance and GS streams even if the shader doesn't use it whenever a shader is declared as version 450. This is a glslang bug, but we can easily enough ignore it for now.	2016-02-23 22:07:25 -08:00
Jason Ekstrand	f0f7cc22f3	anv/descriptor_set: Use the correct size for the descriptor pool The descriptor sizes array gives the total number of each type of descriptor that will ever be allocated from the pool, not the total amount that may be in any particular set. In our case, this simply means that we have to sum a bunch of things up and there we go.	2016-02-23 21:25:37 -08:00
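The summation described in the row above can be sketched like this. `struct pool_size` is a hypothetical stand-in for one entry of the pool's sizes array (a descriptor type plus a count), and `total_descriptors()` is an invented helper; how anv turns the total into a byte size is not covered by the commit message and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the descriptor sizes array. */
struct pool_size {
   uint32_t type;             /* descriptor type, opaque in this sketch */
   uint32_t descriptor_count; /* total ever allocated from the pool */
};

/* The pool must hold every descriptor of every type, so sum the counts. */
static uint32_t total_descriptors(const struct pool_size *sizes, size_t count)
{
   uint32_t total = 0;
   for (size_t i = 0; i < count; i++)
      total += sizes[i].descriptor_count;
   return total;
}

static void check_total_descriptors(void)
{
   const struct pool_size sizes[] = {
      { 0, 16 }, /* e.g. 16 descriptors of one type */
      { 1, 4 },  /* plus 4 of another */
   };
   assert(total_descriptors(sizes, 2) == 20);
   assert(total_descriptors(sizes, 0) == 0);
}
```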
Jason Ekstrand	040355b688	nir/spirv: Add more capabilities	2016-02-23 21:01:00 -08:00
Jason Ekstrand	bd3db3d665	anv/meta: Allocate descriptor pools on-the-fly We can't use a global descriptor pool like we were because it's not thread-safe. For now, we'll allocate them on-the-fly and that should work fine. At some point in the future, we could do something where we stack-allocate them or allocate them out of one of the state streams.	2016-02-23 17:04:19 -08:00
Oded Gabbay	4b7e219e61	gallium/radeon: Correctly translate colorswaps for big endian The current code in r600_translate_colorswap uses the swizzle information to determine which colorswap to use. This works for BE & LE when the nr_channels is <4, but when nr_channels==4 (e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE and LE, because the swizzle info is the same for both of them. As a result, r600g doesn't support 24bit color formats, only 16bit, which forces the user to choose 16bit color in X server. This patch fixes this bug by separating the checks for LE and BE and adapting the swizzle conditions in the BE part of the checks. Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7 Big-Endian Machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: "11.2" "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-23 20:55:40 +02:00
Thomas Hindoe Paaboel Andersen	1807806add	mesa: use sizeof on the correct type Before the luminance stride was based on the size of GL_FLOAT which is just the type constant (0x1406). Change it to use the size of GLfloat. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-23 08:55:35 -07:00
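The distinction fixed in the row above is between taking `sizeof` of an enum constant and of the element type. In this sketch `GL_FLOAT` and `GLfloat` are local stand-ins mirroring the usual OpenGL header definitions (the value 0x1406 is taken from the commit message), and the two stride helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins so the sketch builds without OpenGL headers. */
#define GL_FLOAT 0x1406 /* a type *constant*, i.e. an int literal */
typedef float GLfloat;  /* the actual element type */

/* Mistaken form: measures the int constant, not a float. */
static size_t stride_from_constant(size_t components)
{
   return components * sizeof(GL_FLOAT);
}

/* Intended form: measures the element type. */
static size_t stride_from_type(size_t components)
{
   return components * sizeof(GLfloat);
}

static void check_strides(void)
{
   assert(sizeof(GL_FLOAT) == sizeof(int));
   assert(stride_from_constant(2) == 2 * sizeof(int));
   assert(stride_from_type(2) == 2 * sizeof(float));
}
```

On platforms where `int` and `float` have the same size the two helpers return the same number, so the old expression would only give a wrong stride where those sizes differ; the corrected form states the intent either way.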
Marek Olšák	190a291b03	tgsi/scan: handle holes between VS inputs, assert-fail in other cases "st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap" added a vertex shader declaring IN[0] and IN[2], but not IN[1]. Drivers relying on tgsi_shader_info can't handle holes in declarations, because tgsi_shader_info doesn't track that. This is just a quick workaround meant for stable that will work for vertex shaders. This fixes radeonsi DrawPixels and CopyPixels crashes. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-23 16:42:16 +01:00
Jason Ekstrand	bfbb238dea	anv/descriptor_set: Set descriptor type for immutable samplers	2016-02-22 21:39:14 -08:00
Jason Ekstrand	64e1c84059	intel/genxml: Update macro documentation	2016-02-22 21:20:04 -08:00
Francisco Jerez	31a0affa28	docs: Mark off GL_OES_shader_image_atomic as done. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:59:56 -08:00
Francisco Jerez	058ed980c6	i965/fs: Return result of image atomic in a register of the expected type. So the result is of float type if we're implementing the float overload of imageAtomicExchange. This is the only back-end change required to support OES_shader_image_atomic AFAICT. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:57:09 -08:00
Francisco Jerez	81c16a2dab	glsl: Implement the required built-in functions when OES_shader_image_atomic is enabled. This is basically just the same atomic functions exposed by ARB_shader_image_load_store, with one exception: "highp float imageAtomicExchange( coherent IMAGE_PARAMS, float data);" There's no float atomic exchange overload in the original ARB_shader_image_load_store or GL 4.2, so this seems like new functionality that requires specific back-end support and a separate availability condition in the built-in function generator. v2: Move image availability predicate logic into a separate static function for clarity. Had to pull out the image_function_flags enum from the builtin_builder class for that to be possible. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:56:54 -08:00
Francisco Jerez	be125af95e	glsl: Add usual extension boilerplate for OES_shader_image_atomic. v2: No need for extension enable bits (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:56:35 -08:00
Francisco Jerez	009bbecf6d	mesa: Add extension table entry for OES_shader_image_atomic. v2: No need for extension enable bits (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:55:35 -08:00
Jason Ekstrand	ae619a0355	anv/state: Replace a bunch of ANV_GEN with GEN_GEN	2016-02-22 19:19:00 -08:00
Jason Ekstrand	442dff8cf4	anv/descriptor_set: Stop marking everything as having dynamic offsets	2016-02-22 17:23:29 -08:00
Kristian Høgsberg Kristensen	2570a58bcd	anv: Implement descriptor pools Descriptor pools are an optimization that lets applications allocate descriptor sets through an externally synchronized object (that is, unlocked). In our case it's also plugging a memory leak, since we didn't track all allocated sets and failed to free them in vkResetDescriptorPool() and vkDestroyDescriptorPool().	2016-02-22 17:13:51 -08:00
Kristian Høgsberg Kristensen	353d5bf286	anv/x11: Free swapchain images and memory on destroy	2016-02-22 16:23:47 -08:00
Samuel Pitoiset	2999257e0f	nvc0: rename 3d binding points to NVC0_BIND_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9c6a7bfb40	nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	2c48369f54	nvc0: prefix compute macros with _CP_ instead of _COMPUTE_ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	bbff97ae39	nvc0: rename NVXX_COMPUTE to NVXX_CP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	5330ed959e	nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3d Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	84b9b8f0a3	nvc0/ir: add missing emission of locked load predicate Like unlocked store on shared memory, locked store can fail and the second dest which is a predicate must be emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9f0d059d4b	nvc0/ir: add ld lock/st unlock emission on GK104 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	6526225f88	nv50/ir: restore OP_SELP to be a regular instruction Actually OP_SELP doesn't need to be a compare instruction. Instead we just need to set the NOT modifier when building the instruction. While we are at it, fix the dst register type and use a GPR. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Mark Janes	08b408311c	vulkan: fix out-of-tree builds	2016-02-22 11:31:15 -08:00
Brian Paul	9de3b0273d	svga: unbind index buffer when drawing non-indexed primitives Silences a warning reported by the svga3d device. v2: also null-out the index buffer pointer Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-22 12:14:48 -07:00
Kristian Høgsberg Kristensen	f843aabdd4	intel/genxml: Add README I've had people ask about the design of the pack functions, for example, why aren't we using bitfields. I wrote up a bit of background on why and how we ended up with the current design and we might as well keep that with the code.	2016-02-22 09:14:25 -08:00
Nanley Chery	7b2c63a53c	anv/meta_blit: Handle compressed textures in anv_CmdCopyImage As with anv_CmdCopyBufferToImage, compressed textures require special handling during copies. Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-02-22 09:04:28 -08:00
Ilia Mirkin	571bd9ac42	mesa: add GL_EXT_texture_border_clamp support This extension is identical to GL_OES_texture_border_clamp. But dEQP has tests that want the EXT variant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-22 10:38:56 -05:00
Ilia Mirkin	b6654831c3	mesa: add GL_OES_texture_border_clamp support Only minor differences to the existing ARB_texture_border_clamp support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-22 10:38:56 -05:00
Ilia Mirkin	af8ad49541	mesa: bump version 11.2 has been branched, we're on 11.3 now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-22 10:38:37 -05:00
Jason Ekstrand	f49ba0f7d8	nir/spirv: Add support for multisampled textures	2016-02-21 22:02:38 -08:00
Jason Ekstrand	f1dddeadc2	anv: Fix a typo in apply_dynamic_offsets shader->num_uniforms is in terms of bytes in i965.	2016-02-20 21:24:31 -08:00
Jason Ekstrand	b5868d2343	anv: Zero out the WSI array when initializing the instance	2016-02-20 19:30:14 -08:00
Jason Ekstrand	bc696f1db6	isl: Stop including mesa/main/imports.h It pulls in all sorts of stuff we don't want.	2016-02-20 10:35:25 -08:00
Jason Ekstrand	853fc3e431	genxml: Add more includes in the generated headers	2016-02-20 09:33:20 -08:00
Jason Ekstrand	1f1cf6fcb0	anv: Get rid of GENX_FUNC It was a bad idea.	2016-02-20 09:12:38 -08:00
Jason Ekstrand	371b4a5b33	anv: Switch over to the macros in genxml	2016-02-20 09:09:28 -08:00
Jason Ekstrand	0d76aa9485	intel/genxml: Add a couple of helper headers	2016-02-20 08:35:36 -08:00
Jason Ekstrand	2b85807458	genxml: Stop using unicode in the pack generator This causes python problems and problems when people don't have a locale set properly in their shell.	2016-02-19 08:05:35 -08:00
Dave Airlie	1375cb3c27	anv: fix warning about unused width variable. We don't use width outside the debug clause here.	2016-02-19 08:01:54 -08:00
Jason Ekstrand	698ea54283	anv/pipeline: Fix a typo in the pipeline layout code	2016-02-18 13:55:57 -08:00
Jason Ekstrand	d5bb23156d	anv/allocator: Set is_winsys_bo to false for block pool BOs	2016-02-18 13:55:57 -08:00
Mark Janes	1b37276467	vulkan: fix out-of-tree build We need to be able to find the generated gen*pack.h headers. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-18 12:30:27 -08:00
Jason Ekstrand	e0565f40ea	anv/pipeline: Use nir's num_images for allocating image_params	2016-02-18 11:44:26 -08:00
Jason Ekstrand	79c0781f44	nir/gather_info: Count textures and images	2016-02-18 11:42:36 -08:00
Jason Ekstrand	e881c73975	anv/pipeline: Don't leak the binding map	2016-02-18 11:09:30 -08:00
Jason Ekstrand	8c23392c26	anv/formats: Don't use a compound literal to initialize a const array Doing so makes older versions of GCC rather grumpy. Newer GCC fixes this, but using a compound literal isn't really gaining us anything anyway.	2016-02-18 10:44:08 -08:00
Jason Ekstrand	9851c8285f	Move the intel vulkan driver to src/intel/vulkan	2016-02-18 10:37:59 -08:00
Jason Ekstrand	47b8b08612	Move isl to src/intel	2016-02-18 10:34:47 -08:00
Jason Ekstrand	f6d9587688	vulkan: Move XML and generator into src/intel/genxml	2016-02-18 10:30:29 -08:00
Kristian Høgsberg Kristensen	542c38df36	anv/meta: Initialize blend state for the right attachment We were always initializing only RT 0. We need to initialize the RT we're creating the clear pipeline for.	2016-02-18 10:22:50 -08:00
Kristian Høgsberg Kristensen	05f75a3026	anv/meta: Don't use the blit ds layout in resolve code	2016-02-18 10:22:50 -08:00
Jason Ekstrand	40c76d4efa	Delete nir_lower_samplers.cpp Somehow, in one of the merges with mesa master, the old file must have been kept when nir_lower_samplers.cpp was moved to nir_lower_samplers.c.	2016-02-17 20:16:11 -08:00
Jason Ekstrand	005b9ac758	anv: Gut anv_pipeline_layout Almost none of the data in anv_pipeline_layout is used anymore thanks to doing real layout in the pipeline itself.	2016-02-17 18:04:40 -08:00
Kristian Høgsberg Kristensen	c2581a9375	anv: Build the real pipeline layout in the pipeline This gives us the chance to pack the binding table down to just what the shaders actually need. Some applications use very large descriptor sets and only ever use a handful of entries. Compacted binding tables should be much more efficient in this case. It comes at the down-side of having to re-emit binding tables every time we switch pipelines, but that's considered an acceptable cost.	2016-02-17 18:04:39 -08:00
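The compaction idea in the message above can be sketched as follows. This is only an illustration, not the driver's code: the function name `build_binding_table` and the `(set, binding)` tuple representation are hypothetical.

```python
def build_binding_table(used_bindings):
    """Assign a compact slot to each (set, binding) pair a shader actually uses.

    Hypothetical helper: descriptors that never appear in `used_bindings`
    get no slot, so a large descriptor set costs nothing in the table.
    """
    table = {}
    for pair in used_bindings:
        if pair not in table:
            table[pair] = len(table)  # next free slot, in first-use order
    return table


# A shader that touches only two entries of very large descriptor sets.
slots = build_binding_table([(0, 100), (3, 7), (0, 100)])
```

Because the slots depend on what each pipeline's shaders use, the table differs per pipeline, which is why the message accepts re-emitting binding tables on every pipeline switch.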
Jason Ekstrand	581e4468f9	nir/spirv: Add some more capabilities	2016-02-17 18:04:39 -08:00
Jason Ekstrand	fed8b7f817	anv/pipeline: Delete out-of-bounds fragment shader outputs	2016-02-17 18:04:39 -08:00
Jason Ekstrand	979732fafc	nir: Add a helper for getting the one function from a shader	2016-02-17 18:04:39 -08:00
Jason Ekstrand	8c05b44bbb	nir: Add a nir_foreach_variable_safe helper	2016-02-17 18:04:39 -08:00
Jason Ekstrand	d67d84f5e5	i965/nir: Do lower_io late for fragment shaders	2016-02-17 18:04:39 -08:00
Jason Ekstrand	7c26d8d471	anv/gen7_pipeline: Set WriteDisable = true if we have no color attachments	2016-02-17 18:04:39 -08:00
Jason Ekstrand	9f9cd3de44	anv/gen8_pipeline: Default color attachments to WriteDisable = true	2016-02-17 18:04:39 -08:00
Jason Ekstrand	da9fd74d34	anv: Pull StencilBufferWriteEnable from both sides	2016-02-17 18:04:39 -08:00
Nanley Chery	9963af8bbd	anv: Ignore unused dimensions in vkCreateImage's anv_image We ignore unused dimensions in the isl surface; do the same for the resulting anv_image. Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-02-17 17:32:26 -08:00
Kristian Høgsberg Kristensen	b8da261dc7	spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse "Result is the same as computing the sum of the absolute values of OpDPdx and OpDPdy on P." We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.	2016-02-17 15:28:52 -08:00
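The rule quoted in the message above, written out as a minimal sketch. The derivative values are passed in as plain numbers; how they are computed is outside the scope of this illustration, and both function names are hypothetical.

```python
def fwidth(dpdx, dpdy):
    # Sum of the absolute values of the x and y derivatives of P.
    return abs(dpdx) + abs(dpdy)


def fwidth_vec(dpdx, dpdy):
    # Component-wise form for a vector-valued P.
    return [abs(a) + abs(b) for a, b in zip(dpdx, dpdy)]
```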
Kristian Høgsberg Kristensen	ae3e249d57	anv: Remove hacky PIPE_CONTROL in vkCmdEndRenderPass() The vkCmdPipelineBarrier() command should work as intended now and we need to pull the plug on this old hack.	2016-02-17 15:19:07 -08:00
Kristian Høgsberg Kristensen	5e92e91c61	anv: Rework vkCmdPipelineBarrier() We don't need to look at the stage flags, as we don't really support any fine-grained, stage-level synchronization. We have to do two PIPE_CONTROLs in case we're both flushing and invalidating. Additionally, if we do end up doing two PIPE_CONTROLs, the first, flushing one also has to stall and wait for the flushing to finish, so we don't re-dirty the caches with in-flight rendering after the second PIPE_CONTROL invalidates.	2016-02-17 15:18:06 -08:00
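The ordering rule in the message above can be modelled with a small sketch. It does not reproduce the driver: the function name, the dictionary layout, and the bit names are hypothetical, and the behaviour of the single-command case is an assumption, since the message only describes the two-command case.

```python
def barrier_pipe_controls(flush_bits, invalidate_bits):
    """Return the PIPE_CONTROLs a barrier would emit, as plain dicts."""
    cmds = []
    if flush_bits and invalidate_bits:
        # Flush first and stall until it completes, so in-flight rendering
        # cannot re-dirty the caches after the invalidate.
        cmds.append({"bits": set(flush_bits), "stall": True})
        cmds.append({"bits": set(invalidate_bits), "stall": False})
    elif flush_bits or invalidate_bits:
        # Assumption: a single command is enough when only one kind is needed.
        cmds.append({"bits": set(flush_bits) | set(invalidate_bits), "stall": False})
    return cmds
```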
Kristian Høgsberg Kristensen	3b9b908054	anv: Ignore unused dimensions in vkCreateImage We would assert on unused dimensions (eg extent.depth for VK_IMAGE_TYPE_2D) not being 1, but the specification doesn't put any constraints on those. For example, for VK_IMAGE_TYPE_1D: "If imageType is VK_IMAGE_TYPE_1D, the value of extent.width must be less than or equal to the value of VkPhysicalDeviceLimits::maxImageDimension1D, or the value of VkImageFormatProperties::maxExtent.width (as returned by vkGetPhysicalDeviceImageFormatProperties with values of format, type, tiling, usage and flags equal to those in this structure) - whichever is higher" We'll fix up the arguments to isl to keep isl strict in what it expects.	2016-02-17 12:21:51 -08:00
Kristian Høgsberg Kristensen	b63e28c0e1	anv: Set correct write domain on window system BOs We need to make sure GEM understands that we're writing to the BO, in case it needs to synchronize with other rings (blitter use in display server, for example).	2016-02-17 11:19:56 -08:00
Kristian Høgsberg Kristensen	5caa995c32	Revert "anv: Disable snooping for allocator pools again" This reverts commit `c136672c59`. We still have the intermittent missing flush for VkEvent in certain vulkancts cases: piglit.deqp-vk.api.command_buffers.execute_large_primary piglit.deqp-vk.api.command_buffers.submit_count_non_zero. Let's reenable the snooping until we figure out the root cause.	2016-02-16 23:23:49 -08:00
Kristian Høgsberg Kristensen	ecc67f1aac	anv: Make driver and icd file installable Change the name of the .so to libvulkan_intel.so and add an installable icd with the installed paths. Keep the icd file with build-tree paths, but rename to dev_icd.json to make it clear that it's for development purposes.	2016-02-16 23:23:17 -08:00
Kristian Høgsberg Kristensen	4a2d17f606	anv: Revise PhysicalDeviceFeatures and remove FINISHME	2016-02-16 15:43:12 -08:00
Philipp Zabel	ecd1d94d1c	anv: pCreateInfo->pApplicationInfo parameter to vkCreateInstance may be NULL Fix a NULL pointer dereference in anv_CreateInstance in case the pApplicationInfo field of the supplied VkInstanceCreateInfo structure is NULL [1]. [1] https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#VkInstanceCreateInfo Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>	2016-02-16 14:42:26 -08:00
Jason Ekstrand	48087cfc4e	anv/icd.json: Update the ABI version	2016-02-16 08:02:17 -08:00
Jason Ekstrand	0a3324e66c	anv: Pull Khronos stuff from the README	2016-02-16 07:43:21 -08:00
Kristian Høgsberg Kristensen	a3672a241b	anv/genxml: Include MBO bits for gen7 and gen75	2016-02-15 17:57:03 -08:00
Kristian Høgsberg Kristensen	c2b2ebf1ed	anv: Add missing gen75_cmd_buffer_set_subpass() prototype	2016-02-15 17:40:15 -08:00
Adam Jackson	80ec20351c	anv: Bump to 1.0.3 Probably this should be picked up from <vulkan.h> directly, or we should just assume that any 1.0.x is legal.	2016-02-15 17:38:26 -08:00
Kristian Høgsberg Kristensen	b53edea76c	anv/gen7: Make disabling the FS work We disable the fragment shader for depth/stencil-only pipelines. This commit makes that work for gen7.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	85f67cf16e	anv: Deduplicate render pass code This lets us share the renderpass code and depth/stencil state code between gen 7 and gen 8.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	ac4fd0ed21	anv/gen7: Fix pipeline selection in init_device_state() We need the 3D pipeline for the initial setup, not GPGPU.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	ea694637ac	anv/gen7: Set 3DSTATE_SF depth buffer format correctly We need to pull this from the render pass information at state flush time.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	18dd59538b	anv/gen7: Call flush_pipeline_select_3d() from CmdBeginRenderPass	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	832f73f512	anv: Share flush_pipeline_select_3d() between gen7 and gen8	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	53eaa0a6b8	anv: Fix warning 3DSTATE_VERTEX_ELEMENTS setup This is a little more subtle. If elem_count is 0, nothing else happens in this function, so we return early to avoid warning about uninitialized 'p'.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	5d72d7b12d	anv: Fix misc simple warnings	2016-02-15 17:32:07 -08:00
Jason Ekstrand	08ecd8a8d1	anv/meta_resolve: Set origin_upper_left on gl_FragCoord It's required by the spec and any shaders that don't set it will be broken. I'm not really sure how multisampling was even working before...	2016-02-15 12:45:03 -08:00
Jason Ekstrand	88042b9f10	nir: Get rid of the C++ NIR_SRC/DEST_INIT macros These were originally added to reduce compiler warnings but aren't really needed. Getting rid of them reduces the diff between the Vulkan branch and master, so we might as well.	2016-02-12 21:35:02 -08:00
Kristian Høgsberg Kristensen	c136672c59	anv: Disable snooping for allocator pools again The race we were seeing on cherryview was caused by the multi-submit problem with fences. We can now turn snooping off again and rely on clflush as we intended.	2016-02-12 15:11:31 -08:00
Kristian Høgsberg Kristensen	b0c30b77d4	anv: Submit fence bo only after all command buffers We were submitting the fence bo after each command buffer in a multi command buffer submit, causing us to occasionally complete the fence too early.	2016-02-12 15:08:09 -08:00
Kristian Høgsberg Kristensen	39a120aefe	anv: Implement VkPipelineCache We hash the input SPIR-V, specialization constants, entrypoint and the shader key using SHA1 to determine a unique identifier for the combination. A VkPipelineCache is then a hash table mapping these identifiers to the corresponding prog_data and kernel data.	2016-02-12 11:53:49 -08:00
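A minimal sketch of the scheme described in the message above: a SHA1 over the shader inputs acts as the identifier, and the cache is a hash table keyed by it. The class and function names are hypothetical, and the length prefix on each field is an addition for this illustration, not something the message describes.

```python
import hashlib


def shader_cache_key(spirv, spec_constants, entrypoint, shader_key):
    """Hash the four inputs named in the message into one SHA1 digest."""
    h = hashlib.sha1()
    for part in (spirv, spec_constants, entrypoint.encode(), shader_key):
        # Assumption for this sketch: prefix each field with its length so
        # adjacent fields cannot run together into the same byte stream.
        h.update(len(part).to_bytes(8, "little"))
        h.update(part)
    return h.digest()


class PipelineCache:
    """Hash table mapping identifiers to (prog_data, kernel) pairs."""

    def __init__(self):
        self._entries = {}

    def lookup(self, key):
        return self._entries.get(key)

    def insert(self, key, prog_data, kernel):
        self._entries[key] = (prog_data, kernel)
```

A lookup miss returns `None`, at which point the caller would compile the shader and insert the result under the same key.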
Chad Versace	03bea8fda7	anv/meta_blit: Remove references to clearing Long ago, the blit code used to handle clearing and blitting. - Fix any comments that refer to clearing. - Rename shader var 'attr' to 'tex_pos'. The name 'attr' is an artifact of the time when the shader was used for blitting as well as clearing.	2016-02-12 11:29:29 -08:00
Chad Versace	97b5a07378	anv/meta_blit: Coalesce glsl_vec4_type vars Just a refactor. No behavior change. Several expressions have the same value: they point to glsl_vec4_type(). Coalesce them into a single variable.	2016-02-12 11:29:29 -08:00
Jason Ekstrand	699f21216f	anv/device: clflush simple batches if !LLC	2016-02-12 11:00:42 -08:00
Jason Ekstrand	42155abdd7	anv: Add a clflush_range helper function	2016-02-12 11:00:08 -08:00
Jason Ekstrand	3c8dc1afd1	nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit	2016-02-12 10:40:39 -08:00
Chad Versace	37f4dfb19d	anv/meta: Move blit code to anv_meta_blit.c The clear code lived in anv_meta_clear.c. The resolve code in anv_meta_resolve.c. Only the blit code lived in anv_meta.c, alongside the shared meta code. This is just a copy-paste patch. No change in behavior.	2016-02-12 09:56:24 -08:00
Chad Versace	cf7fd53850	anv/meta: Hardcode smooth texcoord interpolation in blit shaders Trivial cleanup. No change in behavior. Function argument 'attr_flat', in anv_meta.c:build_nir_vertex_shader(), was always false.	2016-02-12 09:15:58 -08:00
Jason Ekstrand	ea93041ccc	anv/device: Use a normal BO in submit_simple_batch	2016-02-11 21:39:15 -08:00
Jason Ekstrand	3a2b23a447	anv: Add a vk_icdGetInstanceProcAddr entrypoint Apparently there are some issues in symbol resolution if an application packages its own loader and you have a system-installed one. I don't really understand the details, but it's not onerous to add.	2016-02-11 21:20:12 -08:00
Jason Ekstrand	25b09d1b5d	anv/event: Use a 64-bit value The immediate write from PIPE_CONTROL is 64-bits at least on BDW. This used to work on 64-bit archs because the compiler would align the following anv_state struct up for us. However, in 32-bit builds, they overlap and it causes problems.	2016-02-11 19:00:56 -08:00
Jason Ekstrand	3086c5a5e1	gen8/pipeline: Properly set bits in PS_EXTRA for W, depth, and sample mask	2016-02-11 15:22:18 -08:00
Jason Ekstrand	4016619931	nir/spirv: Allow the clip distance capability.	2016-02-11 15:14:46 -08:00
Jason Ekstrand	da4a6bbbea	gen8/pipeline: Pull gs_vertex_count from prog_data	2016-02-11 15:13:54 -08:00
Jason Ekstrand	ff8895ba56	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-11 15:09:30 -08:00
Kristian Høgsberg Kristensen	2009e304f7	anv/pack: Handle case where a struct field covers multiple dwords We also didn't add start to field.end to get the absolute field end position.	2016-02-10 22:36:38 -08:00
Jason Ekstrand	f710f3ca37	Merge remote-tracking branch 'mesa-public/master' into vulkan This also reverts commit `1d65abfa58` because now NIR handles texture offsets in a much more sane way.	2016-02-10 17:12:11 -08:00
Jason Ekstrand	7ef3e47c27	Merge commit '85f5c18fef1ff2f19d698f150e23a02acd6f59b9' into vulkan	2016-02-10 17:09:56 -08:00
Kristian Høgsberg Kristensen	d2623a3247	anv: Handle dwords that are all MBZ correctly A few packets have dwords in them that are all MBZ and we failed to write those. This change makes sure we iterate through all dwords and write them all.	2016-02-10 16:36:47 -08:00
Kristian Høgsberg Kristensen	09bb7ea4b7	anv: Fix out-of-tree build We need to be able to find the generated nir_opcodes.h header.	2016-02-10 15:54:28 -08:00
Kristian Høgsberg Kristensen	9cc939d82f	nir: Fix out-of-tree build for spirv2nir This needs to be able to find the generated nir_opcodes.h header.	2016-02-10 15:54:28 -08:00
Jason Ekstrand	9be5a4bc29	nir/spirv: Fix handling of OpGroupMemberDecorate We were pulling the member index from the wrong dword	2016-02-10 15:36:42 -08:00
Jason Ekstrand	ac04c6de2c	nir/spirv: Assert that struct member ids are in-bounds	2016-02-10 15:36:41 -08:00
Mark Janes	8179834030	nir/spirv: fix build_mat_subdet stack smasher The sub-determinant implementation pattern fixed by `6a7e2904e0` has a second instance in the same file. With the previous algorithm, when row and j are both 3, the index overruns the array. This only impacts the stack on 32 bit builds. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:43:03 -08:00
Kristian Høgsberg Kristensen	51c01e292c	anv: Generate pack headers from XML definition This huge commit switches us over to using a simple xml format (genxml) for defining our command streamer commands and a python script for generating the pack headers we use in the driver.	2016-02-10 14:31:26 -08:00
Jason Ekstrand	09b3e30dc6	anv: Fix up spirv for new texture/sampler split stuff	2016-02-09 16:48:36 -08:00
Jason Ekstrand	b14f4c1fd3	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the separate texture/sampler stuff from upstream	2016-02-09 16:47:37 -08:00
Jason Ekstrand	e01dd59b73	vtn: Use const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	e15f7551d1	anv/apply_pipeline_layout: Use the new const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	768bd7f272	Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan This pulls in Rob Clark's const_index changes for NIR	2016-02-09 15:30:39 -08:00
Chad Versace	4c5dcccfba	anv/image: Fix usage for depthstencil images The tests assertion-failed in vkCmdClearDepthStencilImage because the isl surface lacked ISL_SURF_USAGE_DEPTH_BIT. Fixes: https://gitlab.khronos.org/vulkan/mesa/issues/26 Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.host_stage_with_clear_depth_stencil_image_method Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.transfer_stage_with_clear_depth_stencil_image_method	2016-02-09 12:54:30 -08:00
Chad Versace	c5e521f391	anv/image: Refactor choose_isl_surf_usage() - Rename local var isl_flags -> isl_usage. - Fix comment.	2016-02-09 12:54:30 -08:00
Chad Versace	2f4bb00c2b	anv/image: Fix choose_isl_surf_usage() Don't translate VkImageCreateInfo::usage into an isl_surf_usage bitmask. Instead, translate anv_image::usage, which is a superset of VkImageCreateInfo::usage. For-Issue: https://gitlab.khronos.org/vulkan/mesa/issues/26	2016-02-09 12:54:04 -08:00
Chad Versace	bdab29a312	isl: Add more assertions to isl_surf_get_depth_format() R16_UNORM and R32_FLOAT are illegal formats for interleaved depthstencil surfaces.	2016-02-09 11:40:08 -08:00
Jason Ekstrand	1d65abfa58	nir/spirv: Better handle constant offsets in texture lookups	2016-02-09 10:29:05 -08:00
Jason Ekstrand	209820739b	nir/spirv: Set the vtn_mode and interface type for sampler parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	de6c9c5f2e	nir/inline_functions: Don't shadow variables when it isn't needed Previously, in order to get things working, we just always shadowed variables. Now, we rewrite derefs whenever it's safe to do so and only shadow if we have an in or out variable that we write or read to respectively.	2016-02-09 10:29:05 -08:00
Jason Ekstrand	b6c00bfb03	nir: Rework function parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	a485567d3a	anv/WSI/X11: Use the right allocator for freeing swapchains	2016-02-09 10:29:05 -08:00
Chad Versace	e6d3432c81	anv: Replace anv_format::depth_format with ::has_depth isl now understands depth formats. We no longer need depth formats in the anv_format table.	2016-02-09 10:02:50 -08:00
Chad Versace	0a93067993	isl: Add func isl_surf_get_depth_format() For depth surfaces, it gets the value for 3DSTATE_DEPTH_BUFFER.SurfaceFormat.	2016-02-09 10:02:50 -08:00
Chad Versace	4d037b551e	anv: Rename anv_format::surface_format -> isl_format Because that's what it is, an isl format.	2016-02-09 10:02:50 -08:00
Francisco Jerez	cec6fe2ad8	vtn: Clean up acos implementation. Parameterize build_asin() on the fit coefficients so the implementation can be shared while still using different polynomials for asin and acos. Also switch back to implementing acos in terms of asin -- The improvement obtained from cancelling out the pi/2 terms was negligible compared to the approximation error.	2016-02-08 15:23:43 -08:00
Francisco Jerez	f50a651726	nir/spirv: Create integer types of correct signedness. vtn_handle_type() creates a signed type regardless of the value of the signedness flag, which usually doesn't make much of a difference except when the type is used as base sampled type of an image type, which will cause the base type of the NIR image variable to be inconsistent with its format and cause an assertion failure in the back-end (most likely only reproducible on Gen7), and may change the semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).	2016-02-08 15:23:35 -08:00
Kristian Høgsberg Kristensen	6c4c04690f	anv: Deduplicate dispatch calls This can all be shared between gen8+ and pre-gen8.	2016-02-05 22:36:53 -08:00
Kristian Høgsberg Kristensen	bdefaae2b9	anv: Deduplicate anv_CmdDraw calls These were all duplicated between gen7_cmd_buffer.c and gen8_cmd_buffer.c. This commit consolidates both copies in genX_cmd_buffer.c.	2016-02-05 16:41:56 -08:00
Kristian Høgsberg Kristensen	6cdada0360	anv: Move invariant state to small initial batch We use the simple batch helper to submit a batch at driver startup time which holds all the state that never changes. We don't have a whole lot and once we enable tessellation there'll be even less. Even so, it's a simple mechanism and reduces our steady state batch sizes a bit.	2016-02-05 16:13:53 -08:00
Kristian Høgsberg Kristensen	c9c3344c4f	anv: Split out batch submit helper from anv_DeviceWaitIdle We'll reuse this mechanism in the next commit.	2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen	381d85545a	anv: Share scratch_space helper between gen7 and gen8+ The gen7 pipeline has a useful helper function for this, let's use it in gen8_pipeline.c too. The gen7 function has an off-by-one bug though: we have to compute log2(size / 1024) - 1, but we divide by 2048 instead so as to avoid the case where size is less than 1024 and we'd return -1.	2016-02-05 16:13:52 -08:00
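The off-by-one described in the message above can be illustrated with a short sketch. It assumes the helper is built on a find-first-set operation (1-based, returning 0 for an input of 0), which the message does not state; all names here are hypothetical.

```python
def ffs(x):
    """1-based index of the lowest set bit, or 0 when x == 0 (like C ffs())."""
    return (x & -x).bit_length()


def scratch_encoding_naive(size):
    # Divide by 1024, then subtract one: goes to -1 for sizes below 1024.
    return ffs(size // 1024) - 1


def scratch_encoding(size):
    # Dividing by 2048 folds the "- 1" into the division and bottoms out at 0.
    return ffs(size // 2048)
```

For power-of-two sizes of 1024 bytes and up, both forms agree; they differ only for smaller sizes, where the naive form produces -1.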
Kristian Høgsberg Kristensen	d1617dbec3	anv: Share URB setup between gen7 and gen8+	2016-02-05 16:13:52 -08:00
Jason Ekstrand	9401516113	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-05 15:21:11 -08:00
Jason Ekstrand	741744f691	Merge commit mesa-public/master into vulkan This pulls in the patches that move all of the compiler stuff around	2016-02-05 15:03:44 -08:00
Jason Ekstrand	9645b8eb1f	Merge branch mesa-public/master into vulkan	2016-02-05 14:21:13 -08:00
Chad Versace	3eebf3686b	anv: Drop anv_image::needs_*_surface_state anv_image::needs_sampler_surface_state was a redundant member, identical to (usage & VK_IMAGE_USAGE_SAMPLED_BIT). Likewise for the other needs_ members.	2016-02-04 12:20:51 -08:00
Chad Versace	42b9320fbf	anv/image: Rename nonrt_surface_state Let's call it what it is, not what it is not. Rename it to 'sampler_surface_state'.	2016-02-04 12:20:51 -08:00
Jason Ekstrand	1f5d56304f	anv/descriptor_set: Fix descriptor copies We weren't pulling the actual binding location information out of the set layout. The new code mirrors the descriptor write code.	2016-02-03 22:44:33 -08:00
Mark Janes	6a7e2904e0	nir/spirv: fix build_mat4_det stack smasher When generating a sub-determinant matrix, a 3-element swizzle array was indexed with clever inline boolean logic. Unfortunately, when i and j are both 3, the index overruns the array, smashing the next variable on the stack. For 64 bit builds, the alignment of the 3-element unsigned array leaves 32 bits of spacing before the next local variable, hiding this bug. On i386, a subcolumn pointer was smashed then dereferenced.	2016-02-02 15:30:54 -08:00
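The overrun described in the message above can be reproduced with a small model. This is a hypothetical reconstruction of the pattern, not the original source: the function names are invented, and a fixed-length Python list stands in for the 3-element array, so the out-of-range write raises `IndexError` instead of corrupting the stack.

```python
def columns_except_buggy(i):
    """Collect the three column indices other than i, using a boolean in the index."""
    swiz = [None] * 3
    for j in range(4):
        # When i == 3 and j == 3, (j > i) is False and the index is 3:
        # one past the end of the 3-element array.
        swiz[j - (j > i)] = j
    return swiz


def columns_except_fixed(i):
    """Same result, but the loop only ever touches indices 0..2."""
    return [j + (j >= i) for j in range(3)]
```

For i in 0..2 the two versions return the same list; only i == 3 triggers the out-of-range write.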
Mark Janes	ea8c2d118a	anv: Fix anv_descriptor_set reference error on deletion anv_descriptor_set_destroy uses the descriptor set's set_layout member to iterate the set's buffer views. However, the set_layout reference may have previously been freed. On 64 bit builds, this bug generated valgrind errors but did not affect CTS test results. On 32 bit builds, it reliably produces assertions and memory corruption.	2016-02-02 15:28:01 -08:00
Kristian Høgsberg Kristensen	5a06bac4a0	anv: Use @LIB_DIR@ in anv_icd.json Otherwise we may get a lib vs lib64 mismatch.	2016-02-02 14:36:22 -08:00
Jason Ekstrand	fd99f3d658	anv/device: Improve version error reporting	2016-02-02 13:16:13 -08:00
Jason Ekstrand	c7f26bbed9	vulkan: Bump the header to 1.0.3	2016-02-02 13:08:47 -08:00
Jason Ekstrand	0d2145b50f	anv/fence: Default to not ready This is kind-of silly. We really need to do a better job of making sure all objects have all their default values set. We probably also want to, eventually, put everything into the BO (to save memory) and, more specifically, make the GPU write the "ready" flag. That way GetFenceStatus won't ever have to call into the kernel.	2016-02-02 12:22:03 -08:00
Mark Janes	ac0589b213	i965: fix unsigned long overflows for i386 bit-shifts on 32 bit unsigned longs overflow in several places. The intention was for 64 bit integers to be used.	2016-02-01 14:52:22 -08:00
Jason Ekstrand	8776d3cb8e	nir/spirv: Fix UBO loads of a single element of a row-major matrix	2016-02-01 14:03:05 -08:00
Jason Ekstrand	499f7c2f0b	nir/spirv: Handle the LOD parameter of OpImageQuerySizeLod	2016-02-01 14:03:05 -08:00
Jason Ekstrand	b1a1623293	nir/spirv: Add support for SpvOpImage	2016-02-01 14:03:05 -08:00
Jason Ekstrand	593f88c0db	nir/spirv: Fix the UBO loading case of a single row-major matrix column	2016-02-01 14:03:05 -08:00
Jason Ekstrand	abc0e5c1b8	nir/spirv: Fix the UBO loading case of a single row-major matrix column	2016-02-01 13:26:59 -08:00
Jason Ekstrand	2d2c6fc6bb	anv/wsi/wayland: Advertise sRGB	2016-02-01 13:06:35 -08:00
Jason Ekstrand	443c578bca	anv/wsi/x11: Expose SRGB all the time After a long discussion with Eric Anholt and Owen Taylor, I learned that X11 is basically always sRGB as that's what the scanout hardware does and X doesn't modify anything. Therefore, we should just always expose sRGB formats.	2016-02-01 13:06:35 -08:00
Chad Versace	afb327a985	anv: Structify a one-member union anv_descriptor contained a union with one member.	2016-02-01 12:18:10 -08:00
Kristian Høgsberg Kristensen	dc5fdcd6b7	anv: Advertise robustBufferAccess The GPU does most of this for us as long as we set up tight bounds for the buffers, which we do. Additionally, we range check dynamic buffers in the shader. With that it's safe to turn on robustBufferAccess.	2016-02-01 12:00:05 -08:00
Chad Versace	ffbc32f8d9	anv/meta: Strip trailing whitespace	2016-02-01 10:51:01 -08:00
Chad Versace	aa5e257860	anv: Update MSAA status in README	2016-02-01 10:46:24 -08:00
Jason Ekstrand	a88b1eeb13	Update the README	2016-02-01 06:10:51 -08:00
Jason Ekstrand	ea63663a72	wsi/x11: Remove B8G8R8_UNORM We don't actually support that format yet because ISL doesn't have an enum for it. We need to beef up the formats table to allow for tiled-only formats.	2016-02-01 06:00:50 -08:00
Jordan Justen	f96a6c65a3	anv/gen7: Rename gen7_batch_lr* to emit_lr* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 15:06:03 -08:00
Jordan Justen	b207a6b5aa	anv/gen7: Set BypassGatewayControl in MEDIA_VFE_STATE Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 15:06:03 -08:00
Jordan Justen	2d8726a4b7	anv/genX_pipeline: Remove unnecessary #include files Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:30:54 -08:00
Jordan Justen	8e48ff3ad6	anv/gen7: Set SLM size in interface descriptor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:10:54 -08:00
Jordan Justen	ab0d8608d2	anv: Support MEDIA_VFE_STATE for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:08:34 -08:00
Jordan Justen	dd2effb0e7	anv/gen7: Subtract 1 from num_elements when setting up buffer surface state `e8f51fe4` for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	4bb1e7937a	anv/gen7: Disable fs dispatch for depth/stencil only pipelines `292031a` for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	f5b3a2fe32	anv/gen7: Add support for gl_NumWorkGroups Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	7e46cc8603	anv/gen7/compute: Setup push constants and local ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	b1158ced45	anv/genX: Add genX_pipeline.c for compute_pipeline_create Adds initial compute_pipeline_create implementation for gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 08:58:11 -08:00
Jason Ekstrand	1a442a7923	Merge branch 'vulkan' into 'vulkan' Vulkan WSI Wayland fixes Two small fixes to make mailbox mode actually work again. See merge request !4	2016-01-30 10:28:12 -05:00
Jason Ekstrand	c668dc9f75	anv/pass: Initialize has_resolve	2016-01-30 07:16:33 -08:00
Jason Ekstrand	ad813b072a	anv/wsi: Set the platform field of VkIcdSurfaceBase	2016-01-30 07:05:53 -08:00
Jason Ekstrand	5acc4e2ebf	anv/wsi/x11: Actually pull information from the window's visual	2016-01-30 03:51:47 -08:00
Jason Ekstrand	66e8b5cf2b	anv/wsi/x11: Actually check for DRI3	2016-01-30 03:50:31 -08:00
Jason Ekstrand	44ec860cd6	anv/WSI: Support more usage bits They're just images and we have no intention of stomping alpha channels (at least not yet), so there's no reason why you can't sample.	2016-01-29 20:52:44 -08:00
Jason Ekstrand	337c1e0871	anv/formats: Add more compressed formats This adds support for the DX compression formats. Given that ETC and EAC are working fine, these should be ok too.	2016-01-29 20:46:31 -08:00
Jason Ekstrand	c688e4db11	anv/wsi: Rework to be compatible with the loader	2016-01-29 20:39:21 -08:00
Jason Ekstrand	d4953fb340	vulkan: Import vk_icd.h	2016-01-29 20:37:45 -08:00
Jason Ekstrand	a19ceee46c	anv/device: Fix version check The bottom-end check was wrong so it was only working on <= 1.0.0. Oops.	2016-01-29 20:36:58 -08:00
Kristian Høgsberg Kristensen	f28645f71c	anv: Don't disable snooping for mempools There's an intermittent flushing problem with VkEvent that we need to root cause. For now, using the snooping feature keeps the memory pools up to date with GPU writes and fixes the problem.	2016-01-29 17:19:51 -08:00
Kristian Høgsberg Kristensen	0c4ef36360	anv: clflush is only ordered against mfence We can't use the more fine-grained load and store fence commands (lfence and sfence), since clflush is only guaranteed to be ordered with respect to mfence.	2016-01-29 14:56:41 -08:00
Kristian Høgsberg Kristensen	31d3486bd2	anv: Limit flushing to the range of mapped memory	2016-01-29 14:56:41 -08:00
Ben Widawsky	89ec36f221	anv/cmd_buffer: Emit gen9 style SF state for CHV The state for line width changes on Cherryview to use the GEN9 bits (for extra precision).	2016-01-29 14:12:32 -08:00
Ben Widawsky	31508bd0ce	anv/gen8: Extract SF state For upcoming patch to address difference in Cherryview.	2016-01-29 14:11:53 -08:00
Chad Versace	f8a4abcd15	anv: Do resolves at end of subpass	2016-01-28 10:49:50 -08:00
Chad Versace	bef8456ede	anv/meta: Remove unneeded resolve pipeline Vulkan does not allow resolving a single-sample image. So remove that pipeline from anv_meta_state::resolve::pipelines.	2016-01-28 10:45:11 -08:00
Chad Versace	ac5594fa71	anv/meta_resolve: Remove redundant initialization params	2016-01-28 10:14:39 -08:00
Chad Versace	142da00486	anv: Drop const on anv_framebuffer::attachments The attachments should be const, but the driver's function signatures are generally not const-friendly. Drop the const because it conflicts with upcoming anv_cmd_buffer_resolve_subpass().	2016-01-28 10:03:00 -08:00
Chad Versace	22258e279d	anv: Add anv_subpass::has_resolve Indicates that the subpass has at least one resolve attachment.	2016-01-28 10:03:00 -08:00
Chad Versace	3d863e8dad	anv/meta_resolve: Save/Restore viewport and scissor	2016-01-28 10:03:00 -08:00
Chad Versace	8487569fa7	anv/meta_resolve: Begin pass outside emit_resolve() This refactor is preparation for handling subpass resolve attachments.	2016-01-28 10:03:00 -08:00
Chad Versace	2bab3cd681	anv/image: Update usage flags for multisample images Meta resolves multisample images by binding them as textures. Therefore we must add VK_IMAGE_USAGE_SAMPLED_BIT.	2016-01-28 10:03:00 -08:00
Jason Ekstrand	608b411e9f	anv/device: Add a better version check. We now check that the requested version is precisely within the range of versions that we support.	2016-01-28 08:19:40 -08:00
Jason Ekstrand	6286a74f6b	anv/device: Advertise 1.0.2	2016-01-27 22:02:51 -08:00
Jason Ekstrand	ec80d6388a	anv/formats: Properly set FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT This was added last minute and the API bumped to 1.0.2.	2016-01-27 22:02:06 -08:00
Jason Ekstrand	ac75746448	vulkan.h: Update to 1.0.2	2016-01-27 21:59:00 -08:00
Jason Ekstrand	c64bc5463d	anv/device: Improve the api version check to allow 1.0.X	2016-01-27 21:56:46 -08:00
Francisco Jerez	4604b2871a	vtn: Improve accuracy of acos approximation. The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood.	2016-01-27 19:55:21 -08:00
Jason Ekstrand	7fb35a8228	An alternate arccosine implementation	2016-01-27 19:55:21 -08:00
Jason Ekstrand	983db2b804	anv/meta_resolve: Fix a bug in the meta pipeline destroy path	2016-01-27 19:48:43 -08:00
Chad Versace	9b240a1e3d	anv/skl: Fix crash in 16x multisampling We built meta clear and resolve pipelines for only up to 8x samples. There were no 16x pipelines.	2016-01-27 18:38:15 -08:00
Chad Versace	61d3d49820	anv: Fix comment for anv_meta_state arrays Array element i is for 2^i samples, not log2(i) samples.	2016-01-27 18:32:05 -08:00
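The indexing convention from the comment fix above, as a minimal sketch: element i of the array serves 2^i samples, so the index for a given sample count is its base-2 logarithm. The helper name and the placeholder pipeline strings are hypothetical.

```python
def sample_count_to_index(samples):
    """Return i such that samples == 2**i."""
    if samples <= 0 or samples & (samples - 1):
        raise ValueError("sample count must be a power of two")
    return samples.bit_length() - 1


# Supporting up to 16x samples needs five entries: 1x, 2x, 4x, 8x, 16x.
MAX_SAMPLES = 16
pipelines = [
    f"pipeline_{1 << i}x"
    for i in range(sample_count_to_index(MAX_SAMPLES) + 1)
]
```

An array sized for only 8x would have four entries, so a 16x lookup at index 4 would fall off the end, which is consistent with the Skylake 16x crash fixed in `9b240a1e3d`.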
Ben Widawsky	2af3281fee	anv/push constants: Use constant buffer #2 SKL has a workaround which requires either some weird programming of buffer 3, OR, just never using buffer 0. Since we don't actually use multiple constant buffers, it's easier to just not use 0. Only SKL requires this workaround, but there is no harm in applying it to all platforms. The big change here is that buffer #0 is relative to dynamic state base normally (depending upon ISTPM), where buffer 1-3 is a GPU virtual address.	2016-01-27 17:09:36 -08:00
Chad Versace	5d4f3298ae	anv/meta: Implement multisample clears	2016-01-27 17:01:59 -08:00
Chad Versace	eb6fb65fd1	anv/meta: Simplify failure handling during clear init Remove all the fine-grained cleanup in anv_device_init_meta_clear_state(). Instead, if anything fails during initialization, simply call anv_device_finish_meta_clear_state() and let it clean up the partially initialized anv_meta_state::clear.	2016-01-27 17:01:56 -08:00
Chad Versace	4085f1f230	anv/meta: Implement vkCmdResolveImage This handles multisample color images that have a floating-point or normalized format and have a single array layer. This does not yet handle integer formats nor multisample array images.	2016-01-27 16:55:59 -08:00
Chad Versace	8cc1e59d61	anv/meta: Add func anv_meta_get_iview_layer() This function is just meta_blit_get_dest_view_base_array_slice(), but moved to the shared header anv_meta.h. Will be needed by anv_meta_resolve.c.	2016-01-27 16:52:30 -08:00
Chad Versace	8cc6f058ce	anv/gen8: Begin enabling pipeline multisample state As far as I can tell, this patch sets all pipeline multisample state except: - alpha to coverage - alpha to one - the dispatch count for per-sample dispatch	2016-01-27 16:52:27 -08:00
Chad Versace	57e4a5ea99	anv/gen8: Set multisample surface state	2016-01-27 16:48:20 -08:00
Chad Versace	9b3d660878	anv/meta: Merge anv_meta_clear.h into anv_meta.h The header was too small.	2016-01-27 16:48:20 -08:00
Kenneth Graunke	32e4c5ae30	vtn: Make tanh implementation even stupider The dEQP "precision" test tries to verify that the reference functions float sinh(float a) { return ((exp(a) - exp(-a)) / 2); } float cosh(float a) { return ((exp(a) + exp(-a)) / 2); } float tanh(float a) { return (sinh(a) / cosh(a)); } produce the same values as the built-ins. We simplified away the multiplication by 0.5 in the numerator and denominator, and apparently this causes them not to match for exactly 1 out of 13,632 values. So, put it back in, fixing the test, but making our code generation (and precision?) worse.	2016-01-27 15:34:50 -08:00
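For readers comparing the two formulations mentioned in the tanh commit above, here is a minimal sketch of the dEQP-style reference functions next to the simplified form that drops the division by 2. It runs in Python double precision, so it is not expected to reproduce the single mismatch the commit reports for 32-bit GPU arithmetic. The names `tanh_ref` and `tanh_simplified` are made up for this illustration.

```python
import math


def sinh_ref(a: float) -> float:
    return (math.exp(a) - math.exp(-a)) / 2


def cosh_ref(a: float) -> float:
    return (math.exp(a) + math.exp(-a)) / 2


def tanh_ref(a: float) -> float:
    # Reference form quoted in the commit message: sinh(a) / cosh(a).
    return sinh_ref(a) / cosh_ref(a)


def tanh_simplified(a: float) -> float:
    # Same expression with the common factor of 1/2 cancelled.
    return (math.exp(a) - math.exp(-a)) / (math.exp(a) + math.exp(-a))


if __name__ == "__main__":
    for a in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(a, tanh_ref(a), tanh_simplified(a), math.tanh(a))
```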
Jason Ekstrand	8f0ef9bbeb	nir/opt_algebraic: Use a more elementary mechanism for lowering ldexp	2016-01-27 15:21:28 -08:00
Jason Ekstrand	f7d6b8ccfe	gen8/state: Fix QPitch for compressed textures on Broadwell	2016-01-27 15:12:42 -08:00
Jason Ekstrand	162c662585	anv/image: Use the entire image height for compressed meta blits	2016-01-27 15:12:42 -08:00
Nanley Chery	235abfb7e6	anv/image: Enlarge the image level 0 extent The extent previously was supposed to match the mip at a given level under the assumption that the base address would be that of the mip as well. Now however, the base address only matches the offset of the containing tile. Therefore, enlarge the extent to match that of phys_slice0, so that we don't draw/fetch in out of bounds territory. This solution isn't perfect because the base address isn't always at the first tile, therefore the assumed valid memory region by the HW contains some number of invalid tiles on two edges.	2016-01-27 15:12:42 -08:00
Jason Ekstrand	96cf5cfee1	anv/image: Minify before dividing by block dimensions	2016-01-27 15:12:42 -08:00
Jason Ekstrand	1bea1eff38	anv/meta: Don't double-call choose_buffer_format This fixes all the renderpass tests	2016-01-27 15:12:42 -08:00
Nanley Chery	dd22b5c914	anv/meta: Modify make_image_for_buffer()'s image Always use a valid buffer format and convert the extent to units of elements with respect to original image format.	2016-01-27 15:12:42 -08:00
Nanley Chery	d3c1fd53e2	anv/image: Use custom VkBufferImageCopy for iview initialization Use a custom VkBufferImageCopy with the user-provided struct as the base. A few fields are modified when the iview is uncompressed and the underlying image is compressed.	2016-01-27 15:12:42 -08:00
Nanley Chery	6a579ded87	anv: Add offset parameter to anv_image_view_init() This is the offset of the tile that contains the mip specified by isl_surf_get_image_intratile_offset_el(). Used to draw to/from the specified mip.	2016-01-27 15:12:42 -08:00
Nanley Chery	4a0075feeb	anv/meta: Calculate mip offset for compressed texture This value will be used in a later commit.	2016-01-27 15:12:42 -08:00
Nanley Chery	1c87cb51be	anv/meta: Disambiguate slice variable value This will simplify the usage of isl_surf_get_image_intratile_offset_el().	2016-01-27 15:12:42 -08:00
Nanley Chery	8c0c25abde	gen8_state: use iview extent to program RENDER_SURFACE_STATE When creating an uncompressed ImageView on a compressed Image, the SurfaceFormat is updated to match the ImageView's. The surface dimensions must also change so that the HW sees the same size image instead of a 4x larger one. Fixes the following error which results from running many VulkanCTS compressed tests in one shot: ResourceError (vk.queueSubmit(queue, 1, &submitInfo, *m_fence): VK_ERROR_OUT_OF_DEVICE_MEMORY at vktPipelineImageSamplingInstance.cpp:921) Makes all compressed format tests with a height > 1 pass.	2016-01-27 15:12:42 -08:00
Nanley Chery	3f01bbe7f3	anv/image: Scale iview extent by backing image Aligns with formulas presented in Vulkan spec concerning CopyBufferToImage. 18.4 Copying Data Between Buffers and Images This won't conflict with valid API usage, because: 1) Users are not allowed to create an uncompressed ImageView with a compressed Image. see: VkSpec - 11.5 Image Views - VkImageViewCreateInfo's Valid Usage box 2) If users create a differently formatted compressed ImageView with a compressed Image, the block dimensions will still match. see: VkSpec - 28.3.1.5 Format Compatibility Classes - Table 28.5	2016-01-27 15:12:42 -08:00
Nanley Chery	010ab34839	anv/meta: Set depth to 0 for buffer image in CopyBufferToImage() The buffer image is a flat 2D surface. Each surface represents an array/depth layer, therefore, the Z-offset is 0 when blitting.	2016-01-27 15:12:42 -08:00
Nanley Chery	2fb8b859f6	anv/meta: Use the uncompressed rectangle when blitting For an uncompressed ImageView of a compressed Image, the dimensions and offsets are all divided by the appropriate block dimensions. We are not yet using an uncompressed ImageView for a compressed Image, but will do so in a future commit.	2016-01-27 15:12:42 -08:00
Nanley Chery	c3546685ed	i965: Update the surface_format table for ETC formats Enable ETC support for BDW+. In Vulkan, an array lookup on surface_format[] is used to determine HW support for certain formats. In contrast, Mesa dynamically populates an array which reports this information.	2016-01-27 15:12:42 -08:00
Nanley Chery	308ec0279b	anv/image: Update usages of isl_surf_get_image_offset_sa	2016-01-27 15:12:42 -08:00
Nanley Chery	02629a16d1	isl: Add logical z offset to GEN4_2D surfaces 3D surfaces in Skylake are stored with ISL_DIM_LAYOUT_GEN4_2D. Any delta in the logical z offset causes an equivalent delta in the surface's array layer.	2016-01-27 15:12:42 -08:00
Chad Versace	a6ecfe1dd3	isl/tests: Add some tests for intratile offsets Test isl_surf_get_image_intratile_offset_el() in the tests: test_bdw_2d_r8g8b8a8_unorm_512x512_array01_samples01_noaux_tiley0 test_bdw_2d_r8g8b8a8_unorm_1024x1024_array06_samples01_noaux_tiley0	2016-01-27 15:12:42 -08:00
Chad Versace	7ab0d2e2c0	isl: Add func isl_get_intratile_image_offset_el()	2016-01-27 15:12:42 -08:00
Chad Versace	18a83eaa8c	isl/tests: Rename t_assert_offset() Rename it to t_assert_offset_el(), clarifying that the offset is in units of surface elements, not samples.	2016-01-27 15:12:42 -08:00
Chad Versace	fa08f95ff5	isl: Add func isl_surf_get_image_offset_el() This replaces function isl_surf_get_image_offset_sa()	2016-01-27 15:12:42 -08:00
Chad Versace	ea44d31528	isl: Fix row pitch for compressed formats When calculating row pitch, the row's width in samples must be divided by the format's block width. The commit below accidentally removed the division. commit `eea2d4d059` Author: Chad Versace <chad.versace@intel.com> Date: Tue Jan 5 14:28:28 2016 -0800 Subject: isl: Don't align phys_slice0_sa.width twice	2016-01-27 15:12:42 -08:00
Chad Versace	45ecfcd637	isl: Add func isl_surf_get_tile_info()	2016-01-27 15:12:42 -08:00
Kenneth Graunke	9f954310e8	vtn: Fix atan2 for non-scalars. The if/then/else block was bogus, as it can only take a scalar condition, and we need to select component-wise. The GLSL IR implementation of atan2 handles this by looping over components, but I decided to try and do it vector-wise, and messed up. For now, just bcsel. It means that we do the atan1 math even if all components hit the quick case, but it works, and presumably at least one component will hit the expensive path anyway.	2016-01-27 15:07:42 -08:00
Kenneth Graunke	f92a35d831	vtn: Fix Modf. We were botching this for negative numbers - floor of a negative rounds the wrong way. Additionally, both results are supposed to retain the sign of the original. To fix this, just take the abs of both values, then put the sign back. There's probably a better way to do this, but this works for now.	2016-01-27 14:21:08 -08:00
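The Modf fix described above (take the absolute value, split it, then put the sign back on both results) can be sketched as follows. This is an illustration of the idea only, not the vtn code; `modf_naive` and `modf_fixed` are hypothetical names.

```python
import math


def modf_naive(x: float) -> tuple[float, float]:
    # Broken for negatives: floor() rounds toward -infinity.
    whole = float(math.floor(x))
    return x - whole, whole


def modf_fixed(x: float) -> tuple[float, float]:
    # Split the magnitude, then restore the original sign on both parts.
    sign = math.copysign(1.0, x)
    magnitude = abs(x)
    whole = float(math.floor(magnitude))
    frac = magnitude - whole
    return sign * frac, sign * whole


if __name__ == "__main__":
    for x in (2.25, -1.5):
        print(x, "naive:", modf_naive(x), "fixed:", modf_fixed(x))
```

For `-1.5`, the naive version yields a fractional part of `0.5` and a whole part of `-2.0`, while the fixed version keeps the sign of the input on both results.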
Kenneth Graunke	4acfc9effb	i965: Fix SIN/COS precision problems. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-27 13:56:54 -08:00
Kristian Høgsberg Kristensen	b833e7a63c	anv: Put back code to grow shader scratch space This was lost in commit `a71e614d33`.	2016-01-27 11:36:56 -08:00
Kenneth Graunke	38a3a535eb	anv: Update the device limits. Fixes dEQP-VK.api.info.device.properties. I haven't tested any others.	2016-01-26 23:09:45 -08:00
Jason Ekstrand	d3607351fe	gen7/cmd_buffer: SCISSOR_RECT structs are tightly packed The pointer has to be 32-byte aligned, but the structs themselves are 2 dwords each, tightly packed.	2016-01-26 22:10:14 -08:00
Jason Ekstrand	f2f03c5b65	anv/pipeline: Set MaximumVPIndex in 3DSTATE_CLIP	2016-01-26 21:52:59 -08:00
Jason Ekstrand	dc3de6f8df	anv/pipeline: Only lower input indirects if EmitNoIndirectInput is set	2016-01-26 21:45:21 -08:00
Jason Ekstrand	9ac624751e	anv/formats: Use is_power_of_two instead of is_rgb to determine renderability	2016-01-26 20:29:16 -08:00
Jason Ekstrand	2af3acd061	HACK/i965/surface_formats: Mark A4B4G4R4 as being supported The table has this marked as unsupported on all gens, but I don't really believe that given how early it is in the table. I've tested and it seems to work on Broadwell. The Bspec says that it should be renderable on SKL+ but alpha blending is questionable. Side note: We really need to audit the format table again.	2016-01-26 20:29:16 -08:00
Jordan Justen	c20f78dc5d	anv: Support swizzled formats. Some formats require a swizzle in order to map them to actual hardware formats. This allows us to turn on two new Vulkan formats.	2016-01-26 20:29:16 -08:00
Jason Ekstrand	9bc72a9213	anv/image: Do swizzle remapping in anv_image.c TODO: At some point, we really need to make an image_view_init_info that's a flyweight and stop stuffing everything into image_view.	2016-01-26 20:23:59 -08:00
Jason Ekstrand	7d84fe9b1f	HACK: Expose support for stencil blits If someone actually tries to use them, they won't work, but at least we don't fail to return format properties now.	2016-01-26 17:29:49 -08:00
Kenneth Graunke	32dcfc953e	vtn: Delete references to IMix opcode. This is being removed in SPIR-V. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15452	2016-01-26 17:02:35 -08:00
Ben Widawsky	c5dc6cdf26	i965/skl: Utilize new 5th bit for gateway messages Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-01-26 15:44:48 -08:00
Jason Ekstrand	a1ea45b857	genX/pipeline: Don't make vertex bindings with holes	2016-01-26 15:44:18 -08:00
Jason Ekstrand	7ef0d39cb2	anv/cmd_buffer: Put base_instance in the second component	2016-01-26 15:44:02 -08:00
Francisco Jerez	6840cc1513	anv/image: clflush surface state map in anv_fill_buffer_surface_state(). Some of its users had the required clflush on non-LLC platforms, some didn't. Put the clflush in anv_fill_buffer_surface_state() so we don't forget.	2016-01-26 15:14:50 -08:00
Francisco Jerez	fc7a7b31c5	anv/image: clflush the right state map in anv_fill_image_surface_state(). It was clflushing the nonrt_surface_state structure regardless of which state structure was actually being initialized.	2016-01-26 15:14:50 -08:00
Francisco Jerez	a50dc70e21	anv/image: Upload raw buffer surface state for untyped storage image and texel buffer access.	2016-01-26 15:14:50 -08:00
Francisco Jerez	d2ec510dda	anv/image: Fix image parameter initialization.	2016-01-26 15:14:50 -08:00
Francisco Jerez	d9e0b9a06a	isl/gen9: Fix slice offset calculation for 1D array images. The X component of the offset is set to the layer index times layer height which is obviously bogus, return the vertical offset of the slice as Y component instead. Fixes a few image load/store tests that use 1D arrays on SKL when forcing it to fall back to untyped reads and writes.	2016-01-26 15:14:50 -08:00
Jason Ekstrand	cc065e0ad7	i965/fs_surface_builder: Mask signed integers after conversion	2016-01-26 15:14:50 -08:00
Jason Ekstrand	ba393c9d81	anv/image: Actually fill out brw_image_param structs	2016-01-26 15:14:50 -08:00
Jason Ekstrand	aa9987a395	anv/image_view: Add base mip and base layer fields These will be needed by image_load_store	2016-01-26 15:14:50 -08:00
Jason Ekstrand	42cd994177	gen7: Add support for base vertex/instance	2016-01-26 14:56:37 -08:00
Jason Ekstrand	4bf3cadb66	gen8: Add support for base vertex/instance	2016-01-26 14:56:37 -08:00
Jason Ekstrand	6ba67795db	nir/spirv: Add proper support for InstanceIndex	2016-01-26 14:56:37 -08:00
Jason Ekstrand	1c3b7fe1ee	nir/lower_io: Lower INSTANCE_INDEX	2016-01-26 14:56:37 -08:00
Jason Ekstrand	b2b7c93318	glsl/enums: Add an enum for Vulkan instance index	2016-01-26 14:56:37 -08:00
Jason Ekstrand	da75492879	genX/pipeline: Break emit_vertex_input out into common code It's mostly the same and contains some non-trivial logic, so it really should be shared. Also, we're about to make some modifications here that we would really like to share.	2016-01-26 14:56:37 -08:00
Kristian Høgsberg Kristensen	fe6ccb6031	anv: Remove long unused anv_aub.h	2016-01-26 14:53:00 -08:00
Kristian Høgsberg Kristensen	074a7c7d7c	anv: Dirty fragment shader descriptors in meta restore We need to reemit render targets, so dirtying VK_SHADER_STAGE_VERTEX_BIT doesn't help us much.	2016-01-26 14:44:02 -08:00
Kristian Høgsberg Kristensen	725d969753	anv: Reemit STATE_BASE_ADDRESS after second level cmd buffers Otherwise the primary batch will continue using the state base addresses set by the secondary. Fixes remaining renderpass tests.	2016-01-26 14:44:02 -08:00
Chad Versace	df5f6d824b	anv/meta: Fix sample mask in clear pipelines Once we begin emitting the correct sample mask, genX_3DSTATE_SAMPLE_MASK_pack will hit an assertion if the mask contains too many bits.	2016-01-26 11:04:44 -08:00
Jason Ekstrand	725fb3623f	i965/compiler: Set nir_options.vertex_id_zero_based	2016-01-25 16:10:28 -08:00
Jason Ekstrand	6b6a8a99f8	HACK/i965: Default to scalar GS on BDW+	2016-01-25 15:52:53 -08:00
Jason Ekstrand	e462d4d815	Merge remote-tracking branch 'mattst88/nir-lower-pack-unpack' into vulkan	2016-01-25 15:50:31 -08:00
Jason Ekstrand	6bbf3814dc	gen7/state: Apply min/mag filters individually for samplers This fixes tests which apply different min and mag filters, and depend on the min filter to be correct.	2016-01-25 15:33:08 -08:00
Ben Widawsky	9c69f4632d	gen8/state: Apply min/mag filters individually for samplers This fixes tests which apply different min and mag filters, and depend on the min filter to be correct.	2016-01-25 15:29:18 -08:00
Jason Ekstrand	2434ceabf4	i965/fs: Feel free to spill partial reads/writes Now that we properly handle write-masking, this should be safe.	2016-01-25 15:23:10 -08:00
Jason Ekstrand	9c0109a1f6	i965/fs: Properly write-mask spills For unspills (scratch reads), we can just set WE_all all the time because we always unspill into a new GRF. For spills, we have two options: If the instruction has a 32-bit-per-channel destination and "normal" regioning, then we just do a regular write and it will interleave channels from different control-flow paths properly. If, on the other hand, the regioning is non-normal, then we have to unspill, run the instruction, and spill afterwards. In this second case, we need to do the spill with WE_all.	2016-01-25 15:23:10 -08:00
Kristian Høgsberg Kristensen	8e07f7942e	anv: Remove a few finished finishme	2016-01-25 15:16:13 -08:00
Kristian Høgsberg Kristensen	76c096f0e7	anv: Remove stale assert This goes back to when we didn't have the subpass number in the command buffer begin info.	2016-01-25 15:15:59 -08:00
Matt Turner	874ede4983	i965/gen7+: Use NIR for lowering of pack/unpack opcodes.	2016-01-25 14:48:34 -08:00
Matt Turner	5deba3f00a	i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need.	2016-01-25 14:24:07 -08:00
Matt Turner	8bb22dc351	nir: Add lowering support for unpacking opcodes.	2016-01-25 14:24:07 -08:00
Matt Turner	d7781038f5	nir: Add lowering support for packing opcodes.	2016-01-25 14:24:07 -08:00
Matt Turner	6c1b3bc950	i965/fs: Implement support for extract_word. The vec4 backend will lower it.	2016-01-25 14:24:07 -08:00
Matt Turner	26f0444ead	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend.	2016-01-25 14:24:07 -08:00
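The zero-extend versus sign-extend distinction noted in the commit above can be illustrated with a small sketch on 32-bit values. The helper names below are hypothetical and are not the NIR opcode names, which the commit message does not spell out.

```python
def extract_u8(value: int, index: int) -> int:
    """Extract byte `index` (0 = least significant) and zero-extend it."""
    return (value >> (8 * index)) & 0xFF


def extract_i8(value: int, index: int) -> int:
    """Extract byte `index` and sign-extend it."""
    byte = extract_u8(value, index)
    return byte - 0x100 if byte & 0x80 else byte


def extract_u16(value: int, index: int) -> int:
    """Extract 16-bit word `index` (0 = low word) and zero-extend it."""
    return (value >> (16 * index)) & 0xFFFF


def extract_i16(value: int, index: int) -> int:
    """Extract 16-bit word `index` and sign-extend it."""
    word = extract_u16(value, index)
    return word - 0x10000 if word & 0x8000 else word


if __name__ == "__main__":
    v = 0x8001FF34
    print(hex(extract_u8(v, 1)), extract_i8(v, 1))
    print(hex(extract_u16(v, 1)), extract_i16(v, 1))
```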
Nanley Chery	2c94f659e8	anv/meta: Fix CopyBuffer when size matches HW limit Perform a copy when the copy_size matches the HW limit (max_copy_size). Otherwise the current behavior is that we fail the following assertion: assert(height < max_surface_dim); because the values are equal.	2016-01-25 12:26:39 -08:00
Kristian Høgsberg Kristensen	c21de2bf04	anv: Don't use uninitialized barycentric_interp_modes If we don't have a fragment shader, wm_prog_data is undefined.	2016-01-25 11:34:32 -08:00
Kristian Høgsberg Kristensen	292031a1a5	anv: Disable fs dispatch for depth/stencil only pipelines Fixes most renderpass bugs.	2016-01-25 11:26:19 -08:00
Matt Turner	26b2cc6f3a	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR.	2016-01-25 11:12:36 -08:00
Matt Turner	24d385f85c	i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 lowering.	2016-01-25 11:11:56 -08:00
Matt Turner	5eb1145434	nir: Add lowering of nir_op_unpack_half_2x16.	2016-01-25 11:11:56 -08:00
Matt Turner	84166aed92	i965: Make separate nir_options for scalar/vector stages. We'll want to have different lowering options set for scalar/vector stages.	2016-01-25 11:11:26 -08:00
Matt Turner	b6bb3b9bcd	i965: Move brw_compiler_create() to new brw_compiler.c. A future patch will want to use designated initializers, which aren't available in C++, but this is C.	2016-01-25 11:11:25 -08:00
Matt Turner	b126039784	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed.	2016-01-25 11:11:08 -08:00
Jason Ekstrand	a804d82ef6	anv/cmd_buffer: Zero out binding tables and samplers in state_reset This fixes a use of an undefined value if the client uses push constants in a stage without ever setting any descriptors on GEN8-9.	2016-01-22 22:57:05 -08:00
Jason Ekstrand	9e0bc29f80	nir/opcodes: Properly flush denormals in fquantize2f16	2016-01-22 22:18:31 -08:00
Jason Ekstrand	89672d81f3	i965/nir: Properly flush denormals in nir_op_fquantize2f16	2016-01-22 22:18:31 -08:00
Jason Ekstrand	2bfb9f29b8	anv/format: Add a helpful comment about format names	2016-01-22 19:14:41 -08:00
Jason Ekstrand	259e1bdf79	anv/formats: Add support for 3 more formats	2016-01-22 19:03:27 -08:00
Jason Ekstrand	0b6c1275d0	anv/pipeline: Add a default L3$ setup	2016-01-22 19:02:55 -08:00
Chad Versace	99a4885328	anv/formats: Rename ambiguous func parameter vkGetPhysicalDeviceImageFormatProperties has multiple 'flags' parameters.	2016-01-22 17:51:24 -08:00
Chad Versace	149d5ba64d	anv/formats: Advertise multisample formats Teach vkGetPhysicalDeviceImageFormatProperties() to advertise multisampled formats.	2016-01-22 17:50:15 -08:00
Chad Versace	d96d78c3b6	anv/image: Drop assertion that samples == 1	2016-01-22 17:19:57 -08:00
Chad Versace	fda074b23f	isl: Fix gen8_choose_msaa_layout() Gen8 requires any Y tiling, not any standard Y tiling.	2016-01-22 17:19:57 -08:00
Chad Versace	2fa1f745ea	isl: Add func isl_tiling_is_any_y()	2016-01-22 17:19:57 -08:00
Chad Versace	fa5f45e8aa	anv/meta: Assert correct sample counts for blit funcs Add assertions to: anv_CmdBlitImage anv_CmdCopyImage anv_CmdCopyImageToBuffer anv_CmdCopyBufferToImage	2016-01-22 17:19:57 -08:00
Chad Versace	dfcb4ee6df	anv: Add anv_image::samples It's set but not yet used.	2016-01-22 17:19:57 -08:00
Chad Versace	1c5d7b38e2	anv: Use isl_device_get_sample_counts() Use it in vkGetPhysicalDeviceProperties.	2016-01-22 17:19:57 -08:00
Chad Versace	14b753f666	isl: Add func isl_device_get_sample_counts()	2016-01-22 17:19:57 -08:00
Nanley Chery	d4de918ad0	gen8/state: Remove SKL special-casing for MinimumArrayElement MinimumArrayElement carries the same meaning for BDW and SKL. Suggested by Jason. No regressions in dEQP-VK.pipeline.image.view_type.cube_array.* Fixes a number of cube tests, including cube_array_base_slice and cube_base_slice tests.	2016-01-22 17:10:14 -08:00
Chad Versace	6a03c69adb	anv/state: Dedupe code for lowering surface format Add helper anv_surface_format().	2016-01-22 16:49:17 -08:00
Francisco Jerez	11d5c1905c	anv/meta: Set sampler type and instruction arrayness consistently in blit shader.	2016-01-22 16:43:18 -08:00
Francisco Jerez	bf151b8892	anv/meta: Fix meta blit fragment shader for 1D arrays.	2016-01-22 16:43:15 -08:00
Jason Ekstrand	53b83899e0	genX/state: Set CubeSurfaceControlMode to OVERRIDE This makes it act like the address mode is set to TEXCOORDMODE_CUBE whenever this sampler is combined with a cube surface. This should be what we need for Vulkan. Interestingly, the PRM contains a programming note for this field that says simply, "This field must be set to CUBECTRLMODE_PROGRAMMED". However, empirical evidence suggests that it does what the PRM says it does and OVERRIDE is just fine.	2016-01-22 16:34:13 -08:00
Jason Ekstrand	35879fe829	gen8/state: Divide depth by 6 for cube maps for GEN8 For Broadwell cube maps, MinimumArrayElement is in terms of 2d slices (a multiple of 6) but Depth is in terms of whole cubes.	2016-01-22 16:14:54 -08:00
Nanley Chery	3cd8c0bb04	gen8_state: Enable all cube faces These fields are ignored for non-cube surfaces. For cube surfaces these fields should be enabled when using TEXCOORDMODE_CLAMP and TEXCOORDMODE_CUBE. TODO: Determine if these are the only two modes used in Vulkan.	2016-01-22 16:12:52 -08:00
Jason Ekstrand	107a109d1c	isl/format_layout: R11G11B10_FLOAT is unsigned	2016-01-22 11:57:49 -08:00
Jason Ekstrand	e5558ffa64	anv/image: Move common code to anv_image.c	2016-01-22 11:57:01 -08:00
Jason Ekstrand	84612f4014	anv/state: Refactor surface state setup into a "fill" function	2016-01-22 11:40:56 -08:00
Francisco Jerez	448285ebf2	anv/state: Add missing clflushes for storage image surface state.	2016-01-22 11:12:09 -08:00
Francisco Jerez	d533c3796d	anv/state: Factor out surface state calculation from genX_image_view_init. Some fields of the surface state template were dependent on the surface type, which is dependent on the usage of the image view, which wasn't known until the bottom of the function after the template had been constructed. This caused failures in all image load/store CTS tests using cubemaps. Refactor the surface state calculation into a function that is called once for each required usage.	2016-01-22 11:12:09 -08:00
Jason Ekstrand	16780632c2	i965/nir: Temporarily disable mul+add fusion We don't want to do this in the long-run but it's needed for passing the NoContraction tests at the moment. Eventually, we want to plumb this through NIR properly.	2016-01-22 11:10:54 -08:00
Chad Versace	d9abbbe0d8	isl: Fix indentation of isl_format_layout comment	2016-01-22 09:48:11 -08:00
Chad Versace	65f3c420c3	isl/tests: Give tests less cryptic names	2016-01-22 09:46:48 -08:00
Chad Versace	f9d4d09549	isl: Fix isl_surf_get_image_offset_sa for gen4_3d layout Bug found by unit test test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0.	2016-01-22 09:45:22 -08:00
Chad Versace	891ed5ca8c	isl/tests: Add test for bdw 3d surface test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0 Currently fails.	2016-01-22 09:45:21 -08:00
Chad Versace	fbc87ce4be	isl/tests: Remove copy-paste assertion	2016-01-22 07:18:04 -08:00
Chad Versace	63d999b762	isl/tests: Fix build isl_device_init() acquired a new param for bit6 swizzling.	2016-01-22 07:17:57 -08:00
Francisco Jerez	2e54381622	anv/batch_chain: Fix patching up of block pool relocations on Gen8+. Relocations are 64 bits on Gen8+. Most CTS tests that send non-trivial work to the GPU would fail when run from a single deqp-vk invocation because they were effectively relying on reloc presumed offsets to be wrong so the kernel would come and apply relocations correctly.	2016-01-21 16:30:44 -08:00
Jason Ekstrand	13aaf90048	nir/spirv: Ignore cull distance	2016-01-21 16:20:39 -08:00
Jason Ekstrand	13858a1c1a	nir/lower_system_values: Use the correct invocation id for CS	2016-01-21 16:20:39 -08:00
Jason Ekstrand	d8c0e0805b	nir/spirv: Properly assign locations to split structures	2016-01-21 16:20:39 -08:00
Jason Ekstrand	514507825c	nir/spirv: Improve handling of variable loads and copies Before we were assuming that a deref would either be something in a block or something that we could pass off to NIR directly. However, it is possible that someone would choose to load/store/copy a split structure all in one go. We need to be able to handle that.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	7e5e64c8a9	nir/spirv: Make vectors a proper array type with an array_element This makes dealing with single-component derefs easier	2016-01-21 16:20:39 -08:00
Jason Ekstrand	a8af0f536c	nir/spirv: Rework access chains a bit to allow for literals This makes them much easier to construct because you can also just specify a literal number and it doesn't have to be a valid SPIR-V id.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	5d9a6fd526	vtn/variables: Compact local loads/stores into one function This is similar to what we did for block loads/stores.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	b298743d7b	nir/spirv: Add an actual variable struct to spirv_to_nir This allows us, among other things, to do structure splitting on-the-fly to more correctly handle input/output structs.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	2892693d56	nir/spirv: Split variable handling out into its own file It's 1300 lines all by itself and it will only grow.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	1112bf633f	nir/spirv: Rework access chains Previously, we were creating nir_deref's immediately. Now, instead, we have an intermediate vtn_access_chain structure. While a little more awkward initially, this will allow us to more easily do structure splitting on-the-fly.	2016-01-21 16:18:37 -08:00
Kenneth Graunke	824f776355	nir/spirv: Implement ModfStruct opcode.	2016-01-21 14:57:47 -08:00
Kenneth Graunke	f89d5cb807	nir/spirv: Delete stray fmod remnants. Jason left these stray code fragments in `22804de110`.	2016-01-21 14:54:20 -08:00
Kristian Høgsberg Kristensen	ac60e98a58	vk: Do render cache flush for GEN8+ This is needed for SKL as well.	2016-01-21 14:18:52 -08:00
Kristian Høgsberg Kristensen	9eab8fc683	vk: Emit surface state base address before renderpass If we're continuing a render pass, make sure we don't emit the depth and stencil buffer addresses before we set the state base addresses. Fixes crucible func.cmd-buffer.small-secondaries	2016-01-21 14:18:52 -08:00
Kristian Høgsberg Kristensen	c5490d0277	vk: Fix indirect push constants This currently sets the base and size of all push constants to the entire push constant block. The idea is that we'll use the base and size to eventually optimize the amount we actually push, but for now we don't do that.	2016-01-21 11:10:11 -08:00
Kristian Høgsberg Kristensen	83c86e09a8	Merge remote-tracking branch 'jekstrand/wip/i965-uniforms' into vulkan	2016-01-21 11:09:58 -08:00
Jordan Justen	b1a7a27d60	nir/spirv: Handle compute shared atomics Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	a7e5b683ca	nir/spirv: Support workgroup (shared) variable translation Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	bc035db3c8	anv/gen8: Set SLM size in interface descriptor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	819cb69434	anv/gen8+9: Invalidate color calc state when switching to the GPGPU pipeline Port `044acb9256` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	19830031cb	anv/gen8: Enable SLM in L3 cache control register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	97b09a9268	anv/pipeline: Set size of shared variables in prog_data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	86daceb7f2	i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	ca55817fa1	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	36157cd5ea	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	7a9a54b5c8	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	10db985fa0	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	65a5407931	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	9f4a72c9e3	i965/fs/nir: Move shared variable load/store to nir_emit_cs_intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Chad Versace	5ce5a7d021	anv/image: Stop including gen8_pack.h in common file	2016-01-20 15:42:17 -08:00
Chad Versace	8ab527de03	isl: Add a README Most of the file-level comment in isl.h is moved to the README.	2016-01-20 15:24:40 -08:00
Kristian Høgsberg Kristensen	7b7a7c2bfc	vk: Make maxSamplerAllocationCount more reasonable We can't allocate 4 billion samplers. Let's go with 64k.	2016-01-20 14:36:52 -08:00
Kristian Høgsberg Kristensen	8ef002dd7a	vk/tests: Add stub for anv_gem_get_bit6_swizzle()	2016-01-20 13:47:40 -08:00
Kristian Høgsberg Kristensen	420e8664cb	vk/tests: Add isl include path	2016-01-20 13:47:40 -08:00
Kenneth Graunke	b76e4458f9	nir/spirv/glsl450: Use fabs not iabs in ldexp. This was just wrong.	2016-01-20 12:18:02 -08:00
Kristian Høgsberg Kristensen	947ebd9c71	isl: Add isl.h to libisl_la_SOURCES	2016-01-20 12:03:46 -08:00
Jason Ekstrand	21b2d87408	nir/spirv/glsl450: Implement FrexpStruct	2016-01-20 11:36:41 -08:00
Jason Ekstrand	c7896d1868	spirv/nir/glsl450: Use vtn_create_ssa_value to create SSA values	2016-01-20 11:36:26 -08:00
Jason Ekstrand	e45748bade	anv/device: Default to scalar GS on BDW+	2016-01-20 11:16:44 -08:00
Jason Ekstrand	34f9a5f301	nir/spirv: Pull texture dimensionality out of the image when available	2016-01-20 11:11:30 -08:00
Jason Ekstrand	59ef7c6507	anv/meta: fix UpdateBuffer in the case where we do multiple updates	2016-01-20 07:56:48 -08:00
Jason Ekstrand	a0516cfbac	anv/meta: Fix a finishme	2016-01-20 07:33:41 -08:00
Jason Ekstrand	c7203aa621	nir/spirv: Move OpPhi handling to vtn_cfg.c Phi handling is somewhat intrinsically tied to the CFG. Moving it here makes it a bit easier to handle that. In particular, we can now do SSA repair after we've done the phi node second-pass. This fixes 6 CTS tests.	2016-01-19 19:00:00 -08:00
Jason Ekstrand	891564adb9	nir/spirv: Handle OpLine and OpNoLine in foreach_instruction This way we don't have to explicitly handle them everywhere.	2016-01-19 19:00:00 -08:00
Kenneth Graunke	e79f8a4926	nir: Lower ldexp to arithmetic. This is a port of Matt's GLSL IR lowering pass to NIR. It's required because we translate SPIR-V directly to NIR, bypassing GLSL IR. I haven't introduced a lower_ldexp flag, as I believe all current NIR consumers would set the flag. i965 wants this, vc4 doesn't implement this feature, and st_glsl_to_tgsi currently lowers ldexp unconditionally anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-19 18:10:30 -08:00
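As an illustration of what "lowering ldexp to arithmetic" (commit e79f8a4926) can mean, the sketch below builds 2^exp directly from IEEE 754 single-precision bits and then multiplies. It is a simplified Python model, not the NIR pass itself: the function names are hypothetical, and it only covers exponents in the normal range [-126, 127], whereas a real lowering also has to deal with overflow and denormals.

```python
import struct

def pow2_from_bits(exp):
    """Return 2.0**exp by writing the biased exponent into float32 bits."""
    assert -126 <= exp <= 127, "sketch covers normal exponents only"
    bits = (exp + 127) << 23  # sign bit 0, mantissa 0, biased exponent set
    return struct.unpack('<f', struct.pack('<I', bits))[0]

def ldexp_lowered(x, exp):
    """Model ldexp(x, exp) = x * 2**exp using only a multiply."""
    return x * pow2_from_bits(exp)
```

The point of the design is that hardware without a native ldexp only needs integer shifts and a floating-point multiply.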
Kenneth Graunke	b3cc10f3b2	nir: Let nir_opt_algebraic rules contain unsigned constants > INT_MAX. struct.pack('i', val) interprets `val` as a signed integer, and dies if `val` > INT_MAX. For larger constants, we need to use 'I' which interprets it as an unsigned value. This patch makes us use 'I' for all values >= 0, and 'i' for negative values. This should work in all cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-19 18:10:30 -08:00
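The struct.pack behavior described in commit b3cc10f3b2 can be reproduced in a few lines of Python. This is a minimal sketch of the sign-based format choice; the helper name pack_const is hypothetical and is not taken from the nir_opt_algebraic script.

```python
import struct

def pack_const(val):
    """Pack a 32-bit constant: 'I' (unsigned) for val >= 0, 'i' (signed) otherwise."""
    fmt = 'I' if val >= 0 else 'i'
    return struct.pack(fmt, val)

# Packing a value above INT_MAX with the signed format raises struct.error.
try:
    struct.pack('i', 0x80000000)
    signed_pack_failed = False
except struct.error:
    signed_pack_failed = True
```

Both formats produce the same four bytes for a given bit pattern, so pack_const(-1) and pack_const(0xFFFFFFFF) are identical.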
Jason Ekstrand	eb2a119da2	anv/meta: Implement UpdateBuffer	2016-01-19 16:53:35 -08:00
Jason Ekstrand	0ae1bd321e	anv/meta: Implement CmdFillBuffer	2016-01-19 16:53:35 -08:00
Jason Ekstrand	46eef31311	anv/meta_clear: Call emit_clear directly in ClearImage Using the load op means that we end up with recursive meta. We shouldn't be doing that.	2016-01-19 16:53:35 -08:00
Jason Ekstrand	6325a75011	anv/meta_clear: Do save/restore in actual entry points	2016-01-19 16:53:35 -08:00
Jason Ekstrand	56dbf13045	anv: Add support for VK_WHOLE_SIZE in several places	2016-01-19 16:53:35 -08:00
Kenneth Graunke	549be68258	nir/spirv/glsl450: Implement Frexp.	2016-01-19 16:46:03 -08:00
Kenneth Graunke	68c9ca1a94	nir/spirv/glsl450: Blindly implement Atan2. This is untested and probably broken. We already passed the atan2 CTS tests before implementing this opcode. Presumably, glslang or something was giving us a plain Atan opcode instead of Atan2. I don't know why.	2016-01-19 16:14:05 -08:00
Kenneth Graunke	2ab3efa0ad	nir/spirv/glsl450: Implement Atan.	2016-01-19 16:14:05 -08:00
Kenneth Graunke	bc9d9bc2e3	nir/spirv/glsl450: Implement Asin and Acos.	2016-01-19 16:14:05 -08:00
Jason Ekstrand	5e57a87dcf	anv/pipeline: Fix point size	2016-01-19 12:03:13 -08:00
Daniel Stone	f9ca780ea4	anv/wsi: Mark Wayland buffers as busy We were diligently setting Wayland buffers as non-busy, but nowhere in the code did we set them to busy when submitted to the server. This meant that acquire_next_image would only ever find the same buffer in a loop, over and over. Signed-off-by: Daniel Stone <daniels@collabora.com>	2016-01-19 16:54:55 +00:00
Daniel Stone	ba5ef49dcb	anv/wsi: Avoid stuck Wayland connection In acquire_next_image, we are waiting for a wl_buffer::release to arrive and release one of the buffers in our swapchain. Most compositors don't explicitly flush release events, so we may need to perform a roundtrip instead, to ensure the event arrives. Signed-off-by: Daniel Stone <daniels@collabora.com>	2016-01-19 16:54:55 +00:00
Jason Ekstrand	3276610ea6	genX/state: Set LOD pre-clamp to OpenGL mode This gets us another couple hundred sampler tests	2016-01-18 17:51:35 -08:00
Jason Ekstrand	580b2e85e4	isl/device: Add a flag for bit 6 swizzling	2016-01-18 17:21:05 -08:00
Jason Ekstrand	587842a0ca	anv/gem: Add a helper for getting bit6 swizzling information	2016-01-18 17:21:05 -08:00
Jason Ekstrand	c2a6f4302e	nir/spirv: Patch through image qualifiers	2016-01-18 17:21:05 -08:00
Jason Ekstrand	56c8a5f2b8	nir/spirv: Implement ImageQuerySize for storage images SPIR-V only has one ImageQuerySize opcode that has to work for both textures and storage images. Therefore, we have to special-case that one a bit and look at the type of the incoming image handle.	2016-01-18 17:21:05 -08:00
Jason Ekstrand	bb8cadd169	nir/spirv: Insert movs around image intrinsics Image intrinsics always take a vec4 coordinate and always return a vec4. This simplifies the intrinsics a bit but also means that they don't actually match the incoming SPIR-V. In order to compensate for this, we add swizzling movs for both source and destination to get the right number of components.	2016-01-18 17:21:05 -08:00
Jason Ekstrand	6f956b0b22	anv/meta: Improve meta clear cleanup a bit	2016-01-18 14:07:46 -08:00
Jason Ekstrand	45d17fcf9b	anv: Misc allocation scope fixes	2016-01-18 14:04:13 -08:00
Jason Ekstrand	378af64e30	anv/meta: Add a meta allocator that uses SCOPE_DEVICE The Vulkan spec requires all allocations that happen for device creation to happen with SCOPE_DEVICE. Since meta calls into other things that allocate memory, the easiest way to do this is with an allocator.	2016-01-18 14:03:24 -08:00
Jason Ekstrand	3dfa6a881c	anv/meta: Initialize a handle to null	2016-01-18 13:05:02 -08:00
Jason Ekstrand	d49298c702	gen8: Fix border color The border color packet is specified as a 64-byte aligned address relative to dynamic state base address. The way the packing functions are currently set up, we need to provide it with (offset >> 6) because it just shoves the bits in where the PRM says they go and isn't really aware that it's an address.	2016-01-18 12:16:31 -08:00
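The shift described in commit d49298c702 follows from the 64-byte alignment: the low six bits of the offset are always zero, so the packed field stores only the remaining bits. A minimal sketch of that arithmetic, with a hypothetical helper name rather than the driver's actual packing function:

```python
BORDER_COLOR_ALIGNMENT = 64  # bytes, as stated in the commit message

def border_color_pointer_field(offset):
    """Convert a 64-byte-aligned offset into the value the packed field expects."""
    assert offset % BORDER_COLOR_ALIGNMENT == 0, "offset must be 64-byte aligned"
    return offset >> 6  # 64 == 1 << 6, so this drops the always-zero low bits
```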
Jason Ekstrand	bfcc744892	genX/pack: Add a __gen_fixed helper and use it for TextureLODBias The __gen_fixed helper properly clamps the value and also handles negative values correctly. Eventually, we need to make the scripts generate this and use it for more things.	2016-01-18 11:35:04 -08:00
Jason Ekstrand	5a67df2546	anv/pack: Make TextureLODBias a proper 4.8 float XXX: We need to update the generators so this doesn't get stomped.	2016-01-18 10:36:53 -08:00
Jason Ekstrand	15e6af0708	nir/spirv: Handle if's where the merge is also a break or continue	2016-01-18 10:10:47 -08:00
Jason Ekstrand	14ebd0fdd7	nir/spirv: Handle continues that use SSA values from the loop body Instead of emitting the continue before the loop body we emit it afterwards. Then, once we've finished with the entire function, we run nir_repair_ssa to add whatever phi nodes are needed.	2016-01-18 09:43:12 -08:00
Jason Ekstrand	61ba97522e	nir/lower_returns: Repair SSA after doing return lowering	2016-01-18 09:43:12 -08:00
Jason Ekstrand	b11825590d	nir: Add a pass to repair SSA form	2016-01-18 09:43:12 -08:00
Jason Ekstrand	a7a5e8a2de	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes.	2016-01-18 09:18:42 -08:00
Jason Ekstrand	8aab4a7bd2	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value.	2016-01-18 09:18:42 -08:00
Jason Ekstrand	b1f1200e80	util/bitset: Allow iterating over const bitsets	2016-01-18 09:18:42 -08:00
Jason Ekstrand	f509a89082	nir/lower_system_values: Lower vertexID to id+base if needed	2016-01-15 16:15:50 -08:00
Jason Ekstrand	6b64dddd71	anv/batch_chain: Remove padding from the BO before emitting BUFFER_END	2016-01-15 15:59:58 -08:00
Jason Ekstrand	67bf74f020	anv/batch_chain: Don't call current_batch_bo() again We call it once at the top of the function and then hold on to the pointer. It shouldn't have changed, so there's no reason to query for it again.	2016-01-15 15:49:32 -08:00
Jason Ekstrand	117cac75d0	nir/spirv: Stop trusting the SPIR-V for the number of texture coordinates	2016-01-15 11:13:51 -08:00
Chad Versace	0e420cb67f	anv: Populate SURFACE_STATE more safely genX_image_view_init allocates up to 3 separate SURFACE_STATE structures, and populates each from a single template. Stop mutating the template between each final SURFACE_STATE.	2016-01-15 11:00:22 -08:00
Chad Versace	eab6212efd	anv/meta: Stop leaking renderpass and framebuffer	2016-01-15 10:14:07 -08:00
Chad Versace	482a1f5eab	anv/meta: Reuse code for vkCmdClear{Color,DepthStencil}Image The two function bodies were very similar. Move common code to anv_cmd_clear_image(). Fixes all 'dEQP-VK.renderpass.formats.*' on Skylake.	2016-01-15 07:46:10 -08:00
Chad Versace	1afe33f8b3	anv/gen8: Fix SF_CLIP_VIEWPORT's Z elements SF_CLIP_VIEWPORT does not clamp Z values. It only scales and shifts them. Clamping to VkViewport::minDepth,maxDepth is instead handled by CC_VIEWPORT. Fixes dEQP-VK.renderpass.simple.depth on Broadwell.	2016-01-14 22:53:05 -08:00
Chad Versace	842b424d3b	anv/meta: Implement vkCmdClearDepthStencilImage	2016-01-14 22:53:05 -08:00
Chad Versace	e4b17a2e1a	anv/meta: Implement vkCmdClearAttachments	2016-01-14 22:53:05 -08:00
Chad Versace	0038ae2e4a	anv/meta: Add VkClearRect param to emit_clear() Prepares for vkCmdClearAttachments.	2016-01-14 22:53:05 -08:00
Chad Versace	11f5433715	anv: Distinguish between subpass setup and subpass start vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all set up the command buffer for recording commands for some subpass. But only the first two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass. Therefore, calling anv_cmd_buffer_begin_subpass() inside vkBeginCommandBuffer is misleading. Clarify its purpose by renaming it to anv_cmd_buffer_set_subpass() and adding comments.	2016-01-14 22:53:05 -08:00
Chad Versace	deb8dd89b5	anv: Emit load clears at start of each subpass This should improve cache residency for render targets. Pre-patch, vkCmdBeginRenderPass emitted all the meta clears for VK_ATTACHMENT_LOAD_OP_CLEAR before any subpass began. Post-patch, vkCmdBeginRenderPass and vkCmdNextSubpass emit only the clears needed for that current subpass.	2016-01-14 22:53:05 -08:00
Chad Versace	0679bef49f	anv/meta: Create 8 pipelines for color clears This prepares for moving the clear ops from the start of the render pass into each subpass. Pipeline N will be used to clear color attachment N of the current subpass. Currently meta color clears still create a throwaway subpass with exactly one attachment, so currently only pipeline 0 is used. This is an ugly hack to workaround the compiler's current inability to dynamically set the render target index in the render target write message.	2016-01-14 22:53:05 -08:00
Chad Versace	2997b0da4a	anv: Allow override of pipeline color attachment count Add anv_graphics_pipeline_create_info::color_attachment_count. If non-negative, then it overrides the color attachment count in the pipeline's subpass. Useful for meta. (All the hacks for meta!)	2016-01-14 22:53:05 -08:00
Chad Versace	13610c03a7	anv/meta: Name the nir shaders The names appear in debug output.	2016-01-14 22:53:05 -08:00
Chad Versace	6a1a760e3c	anv: Move MAX_* defs to top of anv_private.h Because I need to use MAX_RTS in struct anv_meta_state.	2016-01-14 22:53:05 -08:00
Chad Versace	4c2bafb9bf	anv: Define zero() macro zero(x) memsets x to zero. Eliminates bugs due to errors in memset's size param.	2016-01-14 22:53:05 -08:00
Chad Versace	f2700d665c	anv/meta: Rename emit_load_*_clear funcs The functions will soon handle clears unrelated to VK_ATTACHMENT_LOAD_OP_CLEAR, namely vkCmdClearAttachments. So remove "load" from their name: emit_load_color_clear -> emit_color_clear emit_load_depthstencil_clear -> emit_depthstencil_clear	2016-01-14 22:53:05 -08:00
Chad Versace	356f952f87	anv/meta: Use anv_cmd_state::attachments for clears Rewrite anv_cmd_buffer_clear_attachments, which emits the top-of-pass clears, to use the data provided in anv_cmd_state::attachments. This prepares for deferring each attachment clear to the first subpass that uses the attachment.	2016-01-14 22:53:05 -08:00
Chad Versace	a4b045ca44	anv: Add anv_cmd_state::attachments This array contains attachment state when recording a renderpass instance. It's populated on each call to anv_cmd_buffer_set_pass. The data is currently set but unused. We'll use it later to defer each attachment clear to the subpass that first uses the attachment.	2016-01-14 22:53:05 -08:00
Jason Ekstrand	5d1c2736b6	i965/fs/generator: Change a comment as per jordan's suggestion	2016-01-14 22:03:15 -08:00
Jason Ekstrand	6be517b20e	i965/fs: Always set channel 2 of texture headers in some stages	2016-01-14 20:42:47 -08:00
Jason Ekstrand	e1d13cd058	i965/fs/generator: Take an actual shader stage rather than a string	2016-01-14 20:27:56 -08:00
Jason Ekstrand	47af950df5	anv/apply_pipeline_layout: Stomp texture array size to 1	2016-01-14 18:58:25 -08:00
Jason Ekstrand	6483d3f8fe	nir/spirv: Fix texture return types We were just hard-coding everything to a vec4. This meant we weren't handling shadow samplers at all and integer things were getting the wrong return type.	2016-01-14 18:48:57 -08:00
Kristian Høgsberg Kristensen	2eb52198ff	vk: Fix struct field indentation	2016-01-14 15:18:40 -08:00
Chad Versace	5dea9d0039	anv: Document anv_cmd_state::current_pipeline It's the value of PIPELINE_SELECT.PipelineSelection.	2016-01-14 13:18:40 -08:00
Chad Versace	ed33ccde63	anv: Make vkBeginCommandBuffer reset the command buffer If it's the command buffer's first call to vkBeginCommandBuffer, we must initialize the command buffer's state. Otherwise, we must reset its state. In both cases, let's use anv_ResetCommandBuffer. From the Vulkan 1.0 spec: If a command buffer is in the executable state and the command buffer was allocated from a command pool with the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag set, then vkBeginCommandBuffer implicitly resets the command buffer, behaving as if vkResetCommandBuffer had been called with VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT not set. It then puts the command buffer in the recording state.	2016-01-14 13:14:40 -08:00
Chad Versace	ea20389320	anv: Add FIXME for vkResetCommandPool vkResetCommandPool currently destroys its command buffers. The Vulkan 1.0 spec requires that it only reset them: Resetting a command pool recycles all of the resources from all of the command buffers allocated from the command pool back to the command pool. All command buffers that have been allocated from the command pool are put in the initial state.	2016-01-14 13:14:40 -08:00
Chad Versace	20fd816b6b	anv: Remove duplicate func prototype anv_private.h declared anv_cmd_buffer_begin_subpass twice.	2016-01-14 13:14:40 -08:00
Chad Versace	0415dfcfe7	anv/meta: Add FINISHME for clearing multi-layer framebuffers	2016-01-14 13:14:40 -08:00
Jason Ekstrand	32f8bcb84f	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator."	2016-01-14 12:04:25 -08:00
Jason Ekstrand	45349acad0	Merge remote-tracking branch 'mesa-public/master' into vulkan This fixes the bitfieldextract and bitfieldinsert CTS tests	2016-01-14 11:36:27 -08:00
Jason Ekstrand	f46f4e4886	nir/spirv: Add initial support for Vertex/Instance index	2016-01-14 09:12:32 -08:00
Jason Ekstrand	3d0fac7aca	vulkan.h: Pull in 1.0.1 header	2016-01-14 08:37:54 -08:00
Jason Ekstrand	24a6fcba77	vulkan-1.0.0: Bump the version to 1.0.0	2016-01-14 08:26:37 -08:00
Jason Ekstrand	c310fb032d	vulkan-1.0.0: Rework memory barriers	2016-01-14 08:09:39 -08:00
Jason Ekstrand	b14a78cfb8	vulkan-1.0.0: No-op WSI changes	2016-01-14 08:02:44 -08:00
Jason Ekstrand	6d3322d0e5	vulkan-1.0.0: Make extents unsigned	2016-01-14 08:00:18 -08:00
Jason Ekstrand	b57c72d964	vulkan-1.0.0: Rework blits to use four offsets	2016-01-14 07:59:37 -08:00
Jason Ekstrand	f6cae99294	vulkan-1.0.0: Split out command buffer inheritance info	2016-01-14 07:45:15 -08:00
Jason Ekstrand	f99f847412	vulkan-1.0.0: Re-order some structs in the header	2016-01-14 07:43:05 -08:00
Jason Ekstrand	aab9517f3d	vulkan-1.0.0: Misc. field and argument renames	2016-01-14 07:41:45 -08:00
Jason Ekstrand	d877095e66	vulkan-1.0.0: Get rid of MIPMAP_MODE_BASE	2016-01-14 07:32:16 -08:00
Jason Ekstrand	7b81637762	vulkan-1.0.0: Convert pPreserveAttachments to a uint32_t	2016-01-14 07:30:46 -08:00
Jason Ekstrand	802f00219a	anv/device: Update features and limits	2016-01-14 07:30:46 -08:00
Jason Ekstrand	08735ba91c	anv/cmd_buffer: Fix setting of viewport/scissor count	2016-01-14 07:30:46 -08:00
Jason Ekstrand	ed4fe3e9ba	anv/state: Respect SamplerCreateInfo.anisotropyEnable	2016-01-14 07:30:46 -08:00
Jason Ekstrand	8a81d136f8	anv/image: Fill out VkSubresourceLayout.arrayPitch	2016-01-14 07:30:46 -08:00
BogDan Vatra	102c74277f	WIP: Partially upgrade to vulkan v0.221.0 TODO, make use of: - VkPhysicalDeviceFeatures.drawIndirectFirstInstance, - VkPhysicalDeviceFeatures.inheritedQueries - VkPhysicalDeviceLimits.timestampComputeAndGraphics - VkSubmitInfo.pWaitDstStageMask - VkSubresourceLayout.arrayPitch - VkSamplerCreateInfo.anisotropyEnable	2016-01-14 07:30:46 -08:00
Jordan Justen	8ce2b0e140	nir/spirv: Add support for ArrayLength op Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-13 23:34:45 -08:00
Jason Ekstrand	4507d8a57a	nir/spirv/alu: Properly implement mod/rem	2016-01-13 16:53:02 -08:00
Jason Ekstrand	7d5ae2d34b	i965: Implement nir_op_irem and nir_op_srem	2016-01-13 16:53:02 -08:00
Jason Ekstrand	cac99fffdb	nir: Add more modulus and remainder opcodes SPIR-V makes a distinction between "modulus" and "remainder" for both floating-point and signed integer variants. The difference is primarily one of which source they take their sign from. The "remainder" opcode for integers is equivalent to the C/C++ "%" operation while the "modulus" opcode is more mathematically correct (at least for an unsigned divisor). This commit adds corresponding opcodes to NIR.	2016-01-13 15:18:36 -08:00
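The modulus/remainder distinction in commit cac99fffdb comes down to which operand supplies the sign of the result. The following sketch models both for signed integers; the function names are illustrative and are not the NIR opcode names.

```python
def remainder(a, b):
    """Truncating remainder: the result takes the sign of the dividend (like C's %)."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q  # truncate the quotient toward zero
    return a - b * q

def modulus(a, b):
    """Floored modulus: the result takes the sign of the divisor."""
    return a % b  # Python's % already floors the quotient
```

For example, remainder(-7, 3) is -1 while modulus(-7, 3) is 2; the two agree whenever both operands have the same sign.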
Jason Ekstrand	0079523a0d	nir/spirv: Add support for OpSpecConstantOp	2016-01-13 15:18:36 -08:00
Jason Ekstrand	8c408b9b81	nir/spirv/alu: Factor out the opcode table	2016-01-13 15:18:36 -08:00
Jason Ekstrand	9b7e08118b	anv/pipeline: Pass through specialization constants	2016-01-13 15:18:36 -08:00
Jason Ekstrand	c95c3b2c21	nir/spirv: Add initial support for specialization constants	2016-01-13 15:18:36 -08:00
Jason Ekstrand	610aa00cdf	nir/spirv: Add support for OpQuantize	2016-01-12 15:36:38 -08:00
Jason Ekstrand	282a837317	i965: Implement nir_op_fquantize2f16	2016-01-12 15:35:00 -08:00
Jason Ekstrand	15a56459d7	nir: Add a fquantize2f16 opcode This opcode simply takes a 32-bit floating-point value and reduces its effective precision to 16 bits.	2016-01-12 15:33:02 -08:00
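The effect of an opcode like fquantize2f16 (commit 15a56459d7) can be approximated by round-tripping a value through half precision. This sketch uses Python's struct 'e' format (IEEE 754 binary16); the function name is hypothetical, and the sketch does not model how the real opcode treats values outside the half-precision range or denormals.

```python
import struct

def quantize_to_f16(x):
    """Round x to the nearest half-precision value and return it as a Python float."""
    return struct.unpack('<e', struct.pack('<e', x))[0]
```

Values exactly representable in 16 bits, such as 1.0, pass through unchanged, while 0.1 loses precision.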
Jason Ekstrand	aee970c844	anv/device: Bump the max program size again No one will ever need more than 128K, right?	2016-01-12 13:49:05 -08:00
Kristian Høgsberg Kristensen	d7a193327b	vk: Implement workaround for occlusion queries We have an issue with occlusion queries (PIPE_CONTROL depth writes) after using the pipeline with the VS disabled. We work around it by using a depth cache flush PIPE_CONTROL before doing a depth write. Fixes dEQP-VK.query_pool.*	2016-01-12 11:50:36 -08:00
Jason Ekstrand	6fc278ae4f	anv/UpdateDescriptorSets: Respect write.dstArrayElement	2016-01-12 11:45:12 -08:00
Kristian Høgsberg Kristensen	af422fe9b3	Merge ../mesa into vulkan Merge master again to get the brw_device_info with the correct slice counts for KBL.	2016-01-12 10:54:26 -08:00
Kristian Høgsberg Kristensen	7df20f0c14	vk: Support SpvBuiltInViewportIndex	2016-01-12 10:53:59 -08:00
Kristian Høgsberg Kristensen	2b4bacb84b	vk: Use the correct stride for CC_VIEWPORT structs	2016-01-12 10:53:59 -08:00
Jason Ekstrand	62e56492c3	nir/spirv: Allow non-block variables with interface types in lists The original objective was to disallow UBO and SSBO variables from the variable lists. This was accidentally broken in `b208620fd` when fixing some other interface issues.	2016-01-12 01:32:19 -08:00
Jason Ekstrand	4141d13de5	nir/spirv: Handle matrix decorations on arrays of matrices Connor's original shallow-copy plan works great except that a couple of the decorations apply to a matrix which may be some levels down in an array. We weren't properly unpacking that. This fixes most of the remaining SSBO and UBO layout tests.	2016-01-12 01:04:44 -08:00
Jason Ekstrand	b208620fd2	nir/spirv: Allow creating local/global variables from interface types Not sure if this is actually allowed, but it's not that hard to just strip the interface information from the type.	2016-01-11 17:45:54 -08:00
Jason Ekstrand	350bbd3d15	nir/spirv: Allow base derefs in get_vulkan_resource_index	2016-01-11 17:45:24 -08:00
Jason Ekstrand	1c5393d57d	nir/spirv: Allow OpBranchConditional without a merge This can happen if you have a predicated break/continue.	2016-01-11 17:03:52 -08:00
Jason Ekstrand	24523e98a4	nir/spirv/cfg: Allow breaking from the continue block	2016-01-11 17:03:16 -08:00
Jason Ekstrand	c381906bbd	nir/spirv: Stop wrapping carry/borrow in b2i The upstream versions now return an integer like GLSL/SPIR-V want.	2016-01-11 17:02:30 -08:00
Jason Ekstrand	dee09d7393	nir/spirv: Better handle OpCopyMemory	2016-01-11 16:29:38 -08:00
Jason Ekstrand	1ca97cefb0	nir/spirv: Add no-op support for OpSourceContinued	2016-01-11 16:06:11 -08:00
Jason Ekstrand	bb5882e6af	nir/spirv/cfg: Handle unreachable instructions	2016-01-11 15:35:15 -08:00
Jason Ekstrand	fc3f659aa9	nir/vars_to_ssa: Add phi sources for unreachable predecessors It is possible to end up with unreachable blocks if, for instance, you have an "if (...) { break; } else { continue; } unreachable()". In this case, the unreachable block does not show up in the dominance tree so it never gets visited. Instead, we go and visit all of those in a follow-on pass.	2016-01-11 15:33:44 -08:00
Jason Ekstrand	c974b94578	nir/spirv: Properly handle OpConstantNull	2016-01-11 14:30:46 -08:00
Jason Ekstrand	96683065f2	nir/spirv: Assert that matrix types are valid	2016-01-11 14:30:46 -08:00
Jason Ekstrand	d032ede26f	nir/types: Add an is_error helper	2016-01-11 14:30:46 -08:00
Jason Ekstrand	17cfafd83a	nir/spirv: Handle OpNoLine	2016-01-11 14:30:46 -08:00
Chad Versace	52d4af6a3c	anv/gen7: Remove unneeded helper begin_render_pass() The helper didn't help much. It looks like a leftover from past code-reuse. Now it's called from exactly one location, gen7_CmdBeginRenderPass(). So fold it into its caller.	2016-01-11 14:08:30 -08:00
Jason Ekstrand	790565b06e	anv/pipeline: Handle output lowering in anv_pipeline instead of spirv_to_nir While we're at it, we delete any unused variables. This allows us to prune variables that are not used in the current stage from the shader.	2016-01-11 11:06:06 -08:00
Jason Ekstrand	b8ec48ee76	anv/pipeline: Only delete functions for SPIR-V shaders We can assume that direct NIR shaders only have one entrypoint	2016-01-11 11:06:06 -08:00
Jason Ekstrand	30883adfb8	nir/spirv: Get rid of a bunch of stage asserts Since we may have multiple entrypoints from different stages, we don't know what stage we are actually in so these asserts are invalid.	2016-01-11 11:06:06 -08:00
Jason Ekstrand	9f4ba499d1	nir/spirv: Take an entrypoint stage as well as a name	2016-01-11 11:06:06 -08:00
Jason Ekstrand	83bf1f752d	nir/dead_variables: Add a mode parameter This allows dead_variables to be used on any type of variable.	2016-01-11 11:06:06 -08:00
Kristian Høgsberg Kristensen	a9c0e8f00f	vk: Handle uninitialized FS inputs and gl_PrimitiveID These show up as varying_to_slot[attr] == -1. Instead of storing -1 - 2 in swiz.Attribute[input_index].SourceAttribute, handle it correctly.	2016-01-09 01:03:20 -08:00
Kristian Høgsberg Kristensen	b538ec5409	vk: Support resetting timestamp query pools	2016-01-09 00:51:50 -08:00
Kristian Høgsberg Kristensen	925ad84700	vk: Advertise number of timestamp bits We have 36 bits.	2016-01-09 00:51:14 -08:00
Kristian Høgsberg Kristensen	dae800daa8	vk: Expose correct timestampPeriod for SKL Skylake uses 83.333ns per tick.	2016-01-09 00:50:04 -08:00
Kristian Høgsberg Kristensen	ec8e261208	vk: Mark VkEvent and VkSemaphore as done	2016-01-09 00:48:41 -08:00
Kristian Høgsberg Kristensen	bbb2a85c81	vk: Assert on use of uninitialized surface state This exposes a case where we want to anv_CmdCopyBufferToImage() on an image that wasn't created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT and end up using uninitialized color_rt_surface_state from the meta image view.	2016-01-08 23:51:11 -08:00
Kristian Høgsberg Kristensen	a8cdef3dce	vk: Only begin subpass if we're continuing a render pass If VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT is not set in pBeginInfo->flags, we don't have a render pass or framebuffer. Change the condition that guards looking up render pass and framebuffer to test for VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT instead of VK_COMMAND_BUFFER_LEVEL_SECONDARY. Fixes all remaining crashes in dEQP-VK.api.command_buffers.*.	2016-01-08 23:02:46 -08:00
Kristian Høgsberg Kristensen	7c5e1fd998	vk: Remove unsupported warnings for Skylake and Broxton These are working as well as Broadwell and Cherryview. The recent merge from mesa master brings in Kabylake device info and that should be all we need to enable that.	2016-01-08 22:29:06 -08:00
Kristian Høgsberg Kristensen	f0993f81c7	Merge ../mesa into vulkan	2016-01-08 22:16:43 -08:00
Jason Ekstrand	cfdc955fd5	anv/reloc_list: Make valgrind explicitly check relocation data	2016-01-08 16:44:54 -08:00
Jason Ekstrand	7a1c4a0ccc	nir/spirv: Add matrix determinants and inverses	2016-01-08 16:02:30 -08:00
Jordan Justen	c7f6e42a7d	anv: Increase dynamic pool block size from 2k to 16k This is needed because compute push constant data is replicated per invocation. For gen7, this can be up to 64. With a push constant data max of 128 bytes, this is 8k of data. We need additional space for local-id payloads, so we are going with 16k for now. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-08 13:03:30 -08:00
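The sizing argument in commit c7f6e42a7d is plain arithmetic, spelled out below. The constant names are made up for this sketch; the values are the ones quoted in the commit message.

```python
MAX_REPLICATIONS = 64          # gen7 upper bound quoted in the commit
PUSH_CONSTANT_MAX_BYTES = 128  # maximum push constant data
OLD_BLOCK_SIZE = 2 * 1024      # previous dynamic pool block size
NEW_BLOCK_SIZE = 16 * 1024     # block size chosen by the commit

replicated_bytes = MAX_REPLICATIONS * PUSH_CONSTANT_MAX_BYTES  # 8192 bytes (8k)
left_for_local_ids = NEW_BLOCK_SIZE - replicated_bytes         # 8192 bytes remain
```

The replicated push constants alone exceed the old 2k block, which is why the block size had to grow.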
Jason Ekstrand	4e15d26e47	nir/spirv: Fix a small bug in row-major matrix loading	2016-01-08 12:27:25 -08:00
Jason Ekstrand	fe2f44f2a4	nir/spirv: Use create_ssa_value for block_load_store	2016-01-08 11:50:34 -08:00
Jason Ekstrand	8b9dfb4b6d	nir/spirv: Add real support for outer products	2016-01-08 11:38:59 -08:00
Jason Ekstrand	927ef0ea4e	nir/spirv: Add support for add, subtract, and negate on matrices	2016-01-08 11:26:43 -08:00
Jason Ekstrand	393562f47b	nir/spirv: Split ALU operations out into their own file	2016-01-08 11:26:43 -08:00
Jason Ekstrand	72bff62e7f	nir/spirv: Add support for SSBO atomics	2016-01-07 22:13:46 -08:00
Jason Ekstrand	fe57ad62a6	nir/spirv: Rework UBOs and SSBOs This completely reworks all block load/store operations. In particular, it should get row-major matrices working.	2016-01-07 22:13:46 -08:00
Chad Versace	1818463733	anv/gen9: Fix cube surface state For gen9 SURFTYPE_CUBE, the RENDER_SURFACE_STATE's Depth, MinimumArrayElement, and RenderTargetViewExtent are in units of full cubes and so must be divided by 6. Fixes 'dEQP-VK.pipeline.image.view_type.cube_array.cube_array.'. Now all of 'dEQP-VK.pipeline.image.' passes.	2016-01-07 17:20:25 -08:00
Chad Versace	24d82a3f79	anv/gen8: Refactor genX_image_view_init() Drop the temporary variables for RENDER_SURFACE_STATE's Depth and RenderTargetViewExtent. Instead, assign them in-place. This simplifies the next commit, which fixes gen9 cube surfaces.	2016-01-07 17:20:25 -08:00
Kristian Høgsberg Kristensen	1b1dca75a4	vk: Make sure we emit binding table pointers after push constants SKL needs this to make sure we flush the push constants. It gets a little tricky, since we also need to emit binding tables before push constants, since that may affect the push constants (dynamic buffer offsets and storage image parameters). This patch splits emitting binding tables from emitting the pointers so that we can emit push constants after binding tables but before emitting binding table pointers.	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	a18b5e642c	vk: Implement VK_QUERY_RESULT_WITH_AVAILABILITY_BIT	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	bbf3fc815b	vk: Add missing DepthStallEnable to OQ pipe control	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	067dbd7a17	vk: Issue PIPELINE_SELECT before setting up render pass We need to make sure we've selected the 3D pipeline before we start setting up depth and stencil buffers.	2016-01-07 16:31:57 -08:00
Jordan Justen	d24e88b98e	anv/gen7: Setup state to enable barrier() function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 17:11:46 -08:00
Jordan Justen	36a2304686	anv/gen8: Setup state to enable barrier() function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 17:11:46 -08:00
Chad Versace	4c7f4c25d0	anv/meta: Fix hardcoded format size in anv_CmdCopy* When looping through VkBufferImageCopy regions, for each region we incremented the offset into the VkBuffer assuming the format size was 4. Fixes CTS tests dEQP-VK.pipeline.image.view_type.cube_array.3d.* on Skylake.	2016-01-07 13:56:58 -08:00
Chad Versace	a50c78a5cf	isl: Add missing break statement in array pitch calculation Fixes regression in ed98c374bd3f1952fbab3031afaf5ff4d178ef41.	2016-01-07 11:08:12 -08:00
Chad Versace	d1e6c1b29b	isl/gen9: Fix array pitch of 3d surfaces For tiled 3D surfaces, the array pitch must be aligned to the tile height. From the Skylake BSpec >> RENDER_SURFACE_STATE >> Surface QPitch: Tile Mode != Linear: This field must be set to an integer multiple of the tile height Fixes CTS tests 'dEQP-VK.pipeline.image.view_type.3d.format.r8g8b8a8_unorm.'. Fixes Crucible tests 'func.miptree.r8g8b8a8-unorm.aspect-color.view-3d.'.	2016-01-07 11:04:17 -08:00
Chad Versace	0af77fe5b6	isl: Refactor func isl_calc_array_pitch_sa_rows Update the function to calculate the array pitch in element rows, and rename it accordingly to isl_calc_array_pitch_el_rows.	2016-01-07 11:04:17 -08:00
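The QPitch rule quoted in commit d1e6c1b29b (for tiled surfaces, the array pitch is an integer multiple of the tile height) amounts to rounding up to an alignment. A minimal sketch follows; the helper names and the tile height used in the example are assumptions for illustration, not values taken from isl.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)

def array_pitch_rows(pitch_rows, tile_height, tiled):
    """Pad the array pitch to the tile height only when the surface is tiled."""
    return align_up(pitch_rows, tile_height) if tiled else pitch_rows
```

With an assumed tile height of 32 rows, a pitch of 100 rows becomes 128 for a tiled surface and stays 100 for a linear one.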
Jordan Justen	2f0a10149c	isl: Assert that alignments are not 0 for isl_align Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Jordan Justen	4d68c477ad	anv: Assert that alignments are not 0 for align_* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Jordan Justen	be91f23e3b	isl: Fix image alignment calculation The previous code was resulting in an alignment of 0. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Jason Ekstrand	d8cd5e333e	anv/state: Pull sampler vk-to-gen maps into genX_state_util.h	2016-01-06 19:53:45 -08:00
Jason Ekstrand	195c60deb4	nir/spirv: Wrap borrow/carry ops in b2i NIR specifies them as booleans but SPIR-V wants ints.	2016-01-06 17:13:06 -08:00
Jason Ekstrand	000eb00862	nir/spirv/cfg: Only set fall to true at the start of a case Previously, we were setting it to true at the top of the switch statement. However, this causes all of the cases to get executed until you hit a break. Instead, you want to be not executing at the start, start executing when you hit your case, and end at a break.	2016-01-06 17:00:55 -08:00
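The control flow described in commit 000eb00862 can be modeled with a single flag: nothing executes until the matching case is reached, execution then falls through subsequent cases, and it stops at a break. The sketch below is a toy interpreter with hypothetical names, not the vtn_cfg code.

```python
def run_switch(selector, cases):
    """cases is a list of (label, body_name, ends_with_break) in source order."""
    executed = []
    fall = False  # not executing at the top of the switch
    for label, body_name, ends_with_break in cases:
        if label == selector:
            fall = True  # start executing at the matching case
        if fall:
            executed.append(body_name)
            if ends_with_break:
                break  # a break ends the fall-through
    return executed

cases = [(0, 'a', False), (1, 'b', False), (2, 'c', True), (3, 'd', True)]
```

Setting fall to True before the loop would instead run every case from the top until the first break, which is the bug the commit describes.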
Jordan Justen	de65d4dcaf	anv: Fix build without VALGRIND Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-06 15:54:51 -08:00
Jason Ekstrand	5bbf060ece	i965/compiler: Enable more lowering in NIR We don't need these for GLSL or ARB, but we need them for SPIR-V	2016-01-06 15:30:53 -08:00
Jason Ekstrand	573351cb0f	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f503603d3	nir/opcodes: Fix the folding expression for usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	22804de110	nir/spirv: Properly implement Modf	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f3593d8a1	nir/builder: Add a helper for storing to a deref	2016-01-06 15:30:53 -08:00
Chad Versace	8284786c5d	anv/gen9: Teach gen9_image_view_init() about 1D surface qpitch QPitch is usually expressed as rows of surface elements (where a surface element is a compression block or a single surface sample). Skylake 1D is an outlier; there QPitch is expressed as individual surface elements.	2016-01-06 09:38:57 -08:00
Chad Versace	e05b307942	isl: Add isl_surf_get_array_pitch_el() Will be needed to program SurfaceQPitch for Skylake 1D arrays.	2016-01-06 09:38:57 -08:00
Chad Versace	c1e890541e	isl/gen9: Support ISL_DIM_LAYOUT_GEN9_1D	2016-01-06 09:38:57 -08:00
Chad Versace	eea2d4d059	isl: Don't align phys_slice0_sa.width twice It's already aligned to the format's block width. Don't align it again in isl_calc_row_pitch().	2016-01-06 09:38:57 -08:00
Chad Versace	39d043f94a	isl: Fix the documented units of isl_surf::row_pitch It's the pitch between surface elements, not between surface samples.	2016-01-06 09:38:57 -08:00
Chad Versace	dcb9c11dc7	anv/gen9: Fix oob lookup of surface halign, valign For 1D surfaces and for surfaces with Yf or Ys tiling, the hardware ignores SurfaceVerticalAlignment and SurfaceHorizontalAlignment. Moreover, the anv_halign[] and anv_valign[] lookup tables may not even contain the surface's actual alignment values. So don't do the lookup for those surfaces.	2016-01-06 09:38:57 -08:00
Chad Versace	94566d9b68	anv/meta: Teach meta how to blit from a 1D image Meta needed a VkShader with a 1D sampler type.	2016-01-06 09:38:57 -08:00
Jason Ekstrand	7a069bea5d	nir/spirv: Fix switch statements with duplicate cases	2016-01-05 16:18:01 -08:00
Jason Ekstrand	506a467f16	nir/spirv/cfg: Assert that blocks only ever get added once This effectively prevents infinite loops in cfg_walk_blocks.	2016-01-05 15:56:59 -08:00
Jason Ekstrand	71a25a0b07	nir/spirv: Simplify phi node handling Instead of trying to crawl through predecessor chains and build phi nodes, we just do a poor-man's out-of-ssa on the spot. The into-SSA pass will deal with putting the actual phi nodes in for us.	2016-01-05 14:59:40 -08:00
Jason Ekstrand	ec899f6b42	anv/pipeline: Lower indirect temporaries and inputs	2016-01-05 13:42:52 -08:00
Jason Ekstrand	bff45dc44e	nir: Add an indirect deref lowering pass	2016-01-05 13:42:52 -08:00
Kristian Høgsberg Kristensen	30521fb19e	vk: Implement a basic pipeline cache This is not really a cache yet, but it allows us to share one state stream for all pipelines, which means we can bump the block size without wasting a lot of memory.	2016-01-05 12:03:21 -08:00
Kristian Høgsberg Kristensen	f551047751	vk: Destroy device->mutex when destroying the device	2016-01-05 12:03:21 -08:00
Chad Versace	8d6f0a1b80	isl: Don't force linear for 1d surfaces in gen7_filter_tiling() gen7_filter_tiling() should filter out only tiling flags that are incompatible with the surface. It shouldn't make performance decisions, such as forcing linear for 1D; that's the role of the caller.	2016-01-05 11:37:32 -08:00
Chad Versace	8135786605	isl: Document gen7_filter_tiling()	2016-01-05 11:35:13 -08:00
Chad Versace	33f06842be	isl: Prefer linear tiling for 1D surfaces	2016-01-05 11:35:13 -08:00
Chad Versace	98af1cc6d7	isl: Remove isl_format_layout::bpb struct isl_format_layout contained two near-redundant members: bpb (bits per block) and bs (block size). There do exist some hardware formats for which bpb != 8 * bs, but Vulkan does not use them. Therefore we don't need bpb.	2016-01-05 10:00:39 -08:00
Chad Versace	89b68dc8d0	anv: Use isl_format_layout::bs instead of ::bpb For all formats used by Vulkan, 8 * bs == bpb. (bs=block_size_in_bytes, bpb=bits_per_block)	2016-01-05 10:00:39 -08:00
Chad Versace	a1d64ae561	isl: Align isl_surf::phys_level0_sa to the format's compression block	2016-01-05 09:52:07 -08:00
Chad Versace	2172f0e9bb	isl: Fix mis-documented units of isl_surf::phys_level_sa It's in physical surface samples. Hence the _sa suffix.	2016-01-05 09:52:07 -08:00
Jason Ekstrand	8b403d599b	nir/spirv: Add support for the ControlBarrier instruction	2016-01-04 22:08:24 -08:00
Jason Ekstrand	ba7b5edc26	anv/UpdateDescriptorSets: Use the correct index for the buffer view	2016-01-04 21:36:11 -08:00
Jason Ekstrand	b8f0bea07a	nir/spirv: Implement extended add, sub, and mul	2016-01-04 20:59:16 -08:00
Jason Ekstrand	3a3c4aecf1	nir/spirv: Add support for bitfield operations	2016-01-04 17:37:10 -08:00
Jason Ekstrand	01ba96e059	nir/spirv: Add support for msb/lsb opcodes	2016-01-04 17:37:10 -08:00
Jason Ekstrand	f32370a536	nir/spirv: Add a documenting assert for OpConstantSampler	2016-01-04 17:37:10 -08:00
Jason Ekstrand	0309199802	nir/spirv: Add initial support for ConstantNull	2016-01-04 17:37:10 -08:00
Chad Versace	8cc21d3aea	isl: Align single-level 2D surfaces to compression block This fixes an assertion failure at isl.c:1003. Reported-by: Nanley Chery <nanley.g.chery@intel.com>	2016-01-04 16:48:58 -08:00
Jason Ekstrand	151694228d	anv/formats: Hand out different formats based on tiled vs. linear	2016-01-04 16:08:05 -08:00
Jason Ekstrand	f665fdf0e7	anv/image_view: Separate vulkan and isl formats Previously, anv_image_view had an anv_format pointer that we used for everything. This commit replaces that pointer with a VkFormat enum copied from the API and an isl_format. In order to implement RGB formats, we have to use a different isl_format for the actual surface state than the obvious one from the VkFormat. Separating the two helps us keep things straight.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	ceb05131da	anv_get_isl_format: Support depth+stencil aspect value You just get the depth format in this case.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	a7cc12910d	anv/image: Do more work in anv_image_view_init There was a bunch of common code in gen7/8_image_view_init that we really should be sharing.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	87dd59e578	anv/formats: Rework GetPhysicalDeviceFormatProperties It now calls get_isl_format to get both linear and tiled views of the format and determines linear/tiled properties from that. Buffer properties are determined from the linear format.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	2712c0cca3	anv/formats: Add a tiling parameter to get_isl_format Currently, this parameter does nothing.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	603a3a9439	isl/format: Add some helpers for working with RGB formats	2016-01-04 16:08:05 -08:00
Jason Ekstrand	0639f44d0f	isl: Add a file for format helpers	2016-01-04 16:08:05 -08:00
Jason Ekstrand	5f5fc23e7c	genX/state: Pull some generic helpers into a shared header	2016-01-04 16:08:05 -08:00
Jason Ekstrand	ad9ff4f2b2	meta/blit: Rework how format and aspect choices are made This commit does two things. First, it introduces choose_* functions for choosing formats and aspects. Second, it changes the copy (not blit) code to use appropriately sized UINT formats for everything except depth. There are two main reasons for this: First, it means that compressed and other non-renderable texture upload should "just work" because it won't be tripping over non-renderable formats. Second, it allows us to easily copy an RGB buffer to and from an RGBX image because the formats will get switched over to their UINT variants and the shader will deal with the extra channel for us.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	3200a81a55	anv/image: Add a vk_format field We've been trying to move away from anv_format for a while and this should help with the transition. There are cases (mostly in meta) where we need the original format for the image and not the isl_format. These will be moved over to the new vk_format and everything else will use the isl_format from the particular anv_surface.	2016-01-04 16:08:05 -08:00
Chad Versace	0d7614dce6	isl: Document mnemonic in Yf and Ys tiling The 'f' means "four K". The 's' means "sixty-four K".	2016-01-04 15:37:39 -08:00
Kristian Høgsberg Kristensen	0f34a4ec4e	isl: Use isl_align_npot for row_pitch Many formats are not power-of-two bytes per pixel and we need the non-power-of-two align macro here. This reverts the revert from `4f9a211b`, but keeps the change from `a827b553` that fixed the yuv if-else mix-up.	2016-01-04 10:53:47 -08:00
Kristian Høgsberg Kristensen	abc1c9878f	vk: Don't leak pipeline if initialization fails	2016-01-04 10:42:50 -08:00
Kristian Høgsberg Kristensen	fca1c08e34	vk: Allocate subpass attachment in one big block This avoids making a lot of small allocations and handles allocation failure correctly. Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:10 -08:00
Kristian Høgsberg Kristensen	5526c1782a	vk: Handle allocation failures in meta init paths Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:08 -08:00
Kristian Høgsberg Kristensen	b2ad2a20b6	vk: Handle allocation failure in anv_pipeline_init() Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:06:50 -08:00
Kristian Høgsberg Kristensen	3954594eb4	vk: Call vk_error when we generate a VK_ERROR	2016-01-04 10:02:50 -08:00
Kristian Høgsberg Kristensen	75e01c8b2d	vk: Only finish wayland wsi if we created it Failure during instance creation will leave instance->wayland_wsi undefined. When we then try to clean that up we crash. Set instance->wayland_wsi to NULL on failure and only clean it up if it's non-NULL. Fixes part of dEQP-VK.api.object_management.alloc_callback_fail.*	2016-01-04 10:02:50 -08:00
Chad Versace	05c22f2d74	isl: Fix row pitch for linear buffers isl always aligned the row pitch to the surface's image alignment. This was sometimes wrong when the surface backed a VkBuffer. For a VkBuffer, the surface's row pitch is set by VkBufferImageCopy::bufferRowLength, whose required alignment is only that of the VkFormat. In particular, VkBuffer rows are packed in many dEQP and Crucible tests. And packed rows are rarely aligned to the surface's image alignment. Fixes: dEQP-VK.pipeline.image.view_type.2d.format.r8g8b8a8_unorm.size.13x13	2016-01-04 09:57:25 -08:00
Chad Versace	a827b553d9	isl: Fix swapped if-else in isl_calc_row_pitch The YUV case was applied to non-YUV formats. Oops.	2016-01-04 09:57:23 -08:00
Jason Ekstrand	f6c4658cde	nir/spirv: Fix group decorations They were completely bogus before. For one thing, OpDecorationGroup created a value of type undef rather than decoration_group. Also OpGroupMemberDecorate didn't properly apply the decoration to the different members of the different groups. It should be correct now but there's no good way to test it yet.	2016-01-02 11:53:36 -08:00
Jason Ekstrand	6b0b57225c	anv/device: Only allocate whole pages in AllocateMemory The kernel is going to give us whole pages anyway, so allocating part of a page doesn't help. And this ensures that we can always work with whole pages.	2016-01-02 07:52:24 -08:00
Jason Ekstrand	f076d5330d	anv/device: Handle non-4k-aligned calls to MapMemory As per the spec: minMemoryMapAlignment is the minimum required alignment, in bytes, of host-visible memory allocations within the host address space. When mapping a memory allocation with vkMapMemory, subtracting offset bytes from the returned pointer will always produce a multiple of the value of this limit.	2016-01-01 09:29:29 -08:00
Jason Ekstrand	6b5cbdb317	anv/format: Get rid of num_channels	2015-12-31 12:07:43 -08:00
Jason Ekstrand	3fe1f118f8	anv/cmd_buffer: Fix a pointer-cast typo	2015-12-31 12:07:43 -08:00
Chad Versace	86ecb28ec6	isl: Document some isl_surf::phys_level0_sa invariants isl_dim_layout restricts the range of isl_surf::phys_level0_sa.	2015-12-31 12:06:02 -08:00
Jason Ekstrand	5318424d49	anv/pipeline: Better vertex input channel setup First off, it now uses isl formats instead of anv_format. Also, it properly handles integer vs. floating-point default channels and can properly handle alpha-only channels. (Not sure if those are allowed).	2015-12-31 12:02:08 -08:00
Jason Ekstrand	c6364495b2	anv/pipeline: Move vk_to_gen tables into a shared header	2015-12-31 12:02:08 -08:00
Chad Versace	d25cff687b	isl: Better document surface units Logical pixels, physical surface samples, and physical surface elements. Requested-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-31 11:56:13 -08:00
Chad Versace	373fd89e4b	isl: Document the 3D block extent of isl_format	2015-12-31 11:55:48 -08:00
Jason Ekstrand	1ddcbbf05f	nir/spirv: Add a missing break statement in handle_image	2015-12-30 21:57:04 -08:00
Jason Ekstrand	4f9a211b4a	Revert "isl: Fix assertion failure for npot pixel formats" This reverts commit `96d1baa88d`.	2015-12-30 21:01:55 -08:00
Jason Ekstrand	0bb103d010	nir/spirv: Handle push constants after decorations	2015-12-30 20:54:27 -08:00
Jason Ekstrand	3421ba1843	anv/device: Place memory types at heapIndex == 0 Previously, they were at heapIndex == 1 even though we only advertised one heap.	2015-12-30 19:32:43 -08:00
Jason Ekstrand	cf6ce424e0	nir/spirv: Fix constant num_elements and allocation Thanks to the addition of nir_clone, we now have a num_elements field in nir_constant which we weren't setting. Also, constants have to be parented to the variable they initialize, so we have to make a copy.	2015-12-30 18:51:59 -08:00
Jason Ekstrand	601b7d5f98	nir/lower_outputs_to_temporaries: Reparent constant initializers	2015-12-30 18:51:06 -08:00
Jason Ekstrand	7d57528233	nir/clone: Expose nir_constant_clone	2015-12-30 18:44:19 -08:00
Jason Ekstrand	fed98df428	nir/gather_info: Add support for end_primitive_with_counter	2015-12-30 17:45:43 -08:00
Jason Ekstrand	5afac62b28	nir/spirv: Handle OpLine	2015-12-30 17:45:43 -08:00
Jason Ekstrand	149f35bbba	nir/spirv: Let OpEntryPoint act as an OpName	2015-12-30 17:45:43 -08:00
Jason Ekstrand	5f7f88524c	nir/lower_outputs_to_temporaries: Take a nir_function entrypoint	2015-12-30 17:45:43 -08:00
Jason Ekstrand	0fe4580e64	nir/spirv: Add support for multiple entrypoints per shader This is done by passing the entrypoint name into spirv_to_nir. It will then process the shader as if that were the only entrypoint we care about. Instead of returning a nir_shader, it now returns a nir_function.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	e993e45eb1	nir/spirv: Get the shader stage from the SPIR-V Previously, we depended on it being passed in.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	db3a64fcea	nir/spirv: Use shader stage for determining variable locations	2015-12-30 17:45:43 -08:00
Jason Ekstrand	d7ae2200f9	nir/spirv: Get rid of default GS info shaderc has been fixed for a while now.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	d9c9a117dc	nir/spirv: Handle execution modes as decorations They're basically the same thing.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	2b6bcaf91a	nir/spirv: Separate handling of preamble from type/var/const instructions	2015-12-30 17:45:43 -08:00
Chad Versace	96d1baa88d	isl: Fix assertion failure for npot pixel formats When aligning to isl_format_layout::bs (which is the number of bytes in the pixel), use isl_align_npot() instead of isl_align(), because isl_align() works only for power-of-2 alignment. Fixes assertion in dEQP-VK.pipeline.image.view_type.1d.format.r16g16b16_sfloat.size.512x1.	2015-12-30 16:28:19 -08:00
Jason Ekstrand	07b4f17aaf	nir/spirv/GLSL450: Add support for SAbs	2015-12-30 14:41:49 -08:00
Kenneth Graunke	e6cd0c0e1c	nir/spirv: Implement IsInf and IsNan built-ins.	2015-12-30 14:10:44 -08:00
Jason Ekstrand	a7e827192b	isl: Tile-align height in image size calculation This fixes a bunch of gpu hangs on the dEQP-VK.glsl.ShaderExecutor.common group of CTS tests.	2015-12-30 14:03:47 -08:00
Kenneth Graunke	9f23116bfa	Revert "nir/spirv: Update to the 1.0 GLSL.std.450 header" This reverts commit `b33f5d3889`, and also removes the (empty) case statements for the new built-ins. It doesn't look like glslang has updated yet, so updating the header just breaks everything, as we no longer agree on opcode numbers.	2015-12-30 13:26:56 -08:00
Jason Ekstrand	e6fc170afb	anv/allocator: Rework state streams again If we're going to have valgrind verify state streams then we need to ensure that once we choose a pointer into a block we always use that pointer until the block is freed. I was trying to do this with the "current_map" thing. However, that breaks down because you have to use the map from the block pool to get to the stream_block to get at current_map. Instead, this commit changes things to track the stream_block by pointer instead of by offset into the block pool.	2015-12-30 11:40:38 -08:00
Jason Ekstrand	28243b2fba	gen7/8/cmd_buffer: Allocate the correct amount for COLOR_CALC_STATE We were allocating 6 bytes when we should have been allocating 6 dwords.	2015-12-30 10:37:57 -08:00
Jason Ekstrand	a0b2829f20	anv/stream_alloc: Properly manage valgrind NOACCESS and UNDEFINED status When I first did the valgrindifying for stream allocators, I misunderstood some things about valgrind's expectations for NOACCESS and UNDEFINED. First off, valgrind expects things to be marked NOACCESS before you allocate out of them. Since our blocks came from a pool backed by a mmapped memfd, they came in as UNDEFINED; we needed to mark them as NOACCESS. Also, I didn't realize that VALGRIND_MEMPOOL_CHANGE only updated the mempool allocation state and didn't actually change definedness; we had to add a VALGRIND_MAKE_MEM_UNDEFINED to get rid of the NOACCESS on the newly allocated portion.	2015-12-30 10:36:19 -08:00
Kristian Høgsberg Kristensen	91d93f7908	nir/spirv: Lower gl_GlobalInvocationID correctly Use nir_intrinsic_load_local_invocation_id, not nir_intrinsic_load_invocation_id (missing 'local'), which is a geometry shader built-in.	2015-12-30 00:03:54 -08:00
Jason Ekstrand	451fe2670c	nir/spirv/cfg: Handle discard	2015-12-29 19:23:25 -08:00
Jason Ekstrand	5693637faa	nir/print: Handle variables with var->name == NULL	2015-12-29 16:58:00 -08:00
Jason Ekstrand	8cc55780fd	nir/inline_functions: Switch to inlining everything	2015-12-29 16:58:00 -08:00
Kenneth Graunke	7cdcee3bed	nir/spirv/glsl450: Enumerate more built-in opcodes.	2015-12-29 16:06:35 -08:00
Kenneth Graunke	ccd84848f0	anv/state: Fix reversed MIN vs. MAX in levelCount handling. The point is to promote a levelCount of 0 to 1 before subtracting 1. This needs MAX, not MIN.	2015-12-29 15:51:14 -08:00
Jason Ekstrand	2a58cb03d0	nir/spirv: Use instr_rewrite_src for updating phi sources You can't just add a new source to a phi because use/def information won't get updated properly. Instead, you have to use one of the core helpers. Some day, we may want to add a nir_phi_instr_add_src helper.	2015-12-29 15:44:39 -08:00
Jason Ekstrand	69d5838aee	nir/validate: Don't validate the return deref for void function calls	2015-12-29 15:35:29 -08:00
Jason Ekstrand	51b04d03d5	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb.	2015-12-29 15:29:27 -08:00
Kenneth Graunke	b4a1c9b506	nir/spirv/glsl450: Implement inverse hyperbolic trig built-ins.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	2ea111664c	nir/spirv/glsl450: Implement Refract built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	74529a2c50	nir/spirv/glsl450: Implement hyperbolic trig built-ins.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	0b1a436ac8	nir/spirv/glsl450: implement Reflect built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	659a3623b0	nir/spirv/glsl450: Implement FaceForward built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	b10af36d93	nir/spirv/glsl450: Implement SmoothStep.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	6a0fa2d758	nir/spirv/glsl450: Implement Cross built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	083fd6ec2a	nir/spirv/glsl450: Implement Clamp/SClamp/UClamp.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	034010924e	nir/spirv/glsl450: Implement the Log built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	ffc5ae7c9e	nir/spirv/glsl450: Implement Exp built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	227e250005	nir/spirv/glsl450: Add a helper for doing fclamp().	2015-12-29 15:27:03 -08:00
Kenneth Graunke	0f801752f2	nir/spirv/glsl450: Add helpers for calculating exp() and log().	2015-12-29 15:27:03 -08:00
Kenneth Graunke	9c9edd1ce8	nir/spirv/glsl450: Add an 'nb' shortcut variable. "nb" is shorter and more convenient than "&b->nb", especially when several operations are composed together into a larger expression tree.	2015-12-29 15:27:03 -08:00
Jason Ekstrand	5f04a61219	nir/lower_returns: Don't just change the type of a jump. It doesn't give core NIR the opportunity to update predecessors and successors. Instead, we have to remove and re-insert the instruction.	2015-12-29 14:51:47 -08:00
Jason Ekstrand	6fa47c9c17	nir/builder: Add a nir_jump helper	2015-12-29 14:48:34 -08:00
Jason Ekstrand	37a38548d4	glsl/types.cpp: Fix function_key_compare	2015-12-29 14:32:10 -08:00
Jason Ekstrand	b33f5d3889	nir/spirv: Update to the 1.0 GLSL.std.450 header	2015-12-29 14:29:03 -08:00
Jason Ekstrand	a33fcc0fd4	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir_builder_init_simple_shader and allows us to delete anv_nir_builder.h entirely.	2015-12-29 13:53:41 -08:00
Jason Ekstrand	5dd4386b92	nir/spirv: Use a C99-style initializer for structure fields This ensures that all unknown fields get zero-initialized so we don't have undefined values floating around.	2015-12-29 13:15:20 -08:00
Jason Ekstrand	e10b0e2b49	anv/pipeline: Use vs_prog_data.inputs_read when computing vb_used	2015-12-29 13:03:01 -08:00
Jason Ekstrand	0a2ab87947	nir/spirv: Move CF emit code into vtn_cfg.c	2015-12-29 12:50:31 -08:00
Jason Ekstrand	4e22cd2e32	nir/spirv: Add support for switch statements	2015-12-29 12:50:31 -08:00
Jason Ekstrand	cf555dc1c2	nir/spirv: A couple simple loop fixes	2015-12-29 12:50:31 -08:00
Jason Ekstrand	303d095f58	nir/spirv: Add an actual CFG data structure The current data structure doesn't handle much that we couldn't handle before. However, this will be absolutely crucial for doing switch statements. Also, this should fix structured continues.	2015-12-29 12:50:31 -08:00
Jason Ekstrand	bbf99511d0	gen7/8/pipeline: s/vb_used/elements in emit_vertex_input	2015-12-29 09:40:22 -08:00
Kristian Høgsberg Kristensen	fc03723bcd	vk: Fill out buffer surface state when updating descriptor set We can do this when we update the descriptor set instead of on the fly.	2015-12-28 21:57:56 -08:00
Kristian Høgsberg Kristensen	a00524a216	vk: Unstub VkSemaphore implementation There really is nothing to do for us here, at least with the current kernel interface.	2015-12-28 21:57:56 -08:00
Jason Ekstrand	5fab35d090	gen7/pipeline: Actually use inputs_read from the VS for laying out inputs	2015-12-28 18:21:11 -08:00
Jason Ekstrand	b090f9dce1	gen8/pipeline: Actually use inputs_read from the VS for laying out inputs	2015-12-28 18:21:11 -08:00
Jason Ekstrand	3eb108ef87	anv/meta: Fix the pos_out location for the vertex shader	2015-12-28 18:21:11 -08:00
Jason Ekstrand	b005fd62f9	nir/spirv: Add GLSL.std.450.h It accidentally got removed during the mass rename.	2015-12-28 15:46:22 -08:00
Jason Ekstrand	9c84b6cce0	anv/device: Set device->info sooner in CreateDevice anv_block_pool_init calls anv_block_pool_grow which checks device->info.has_llc to see if it needs to set caching parameters. If we don't set device->info early enough, this reads an undefined value which is probably 0 and not what we want on llc platforms. Found with valgrind.	2015-12-28 13:29:01 -08:00
Jason Ekstrand	763176a3e2	nir/lower_returns: Fix a bug in loop lowering	2015-12-28 13:22:09 -08:00
Jason Ekstrand	7aaed91581	nir/spirv: Move to its own directory	2015-12-28 11:49:39 -08:00
Jason Ekstrand	d5fa51bdee	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the removal of nir_function_overload	2015-12-28 10:56:31 -08:00
Jason Ekstrand	d9dcfafacc	nir/spirv: Use nir_build_alu for alu instructions	2015-12-28 10:35:31 -08:00
Jason Ekstrand	ea77b384e8	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in tessellation and the store_var changes that go with it.	2015-12-27 23:23:05 -08:00
Jason Ekstrand	f948767471	nir/lower_returns: Better algorithm as per connor	2015-12-27 22:50:45 -08:00
Jason Ekstrand	3489f66056	nir: Add a cursor helper for getting a cursor after any phi nodes	2015-12-27 22:50:14 -08:00
Jason Ekstrand	c60456dfaa	nir/gather_info: Handle multi-slot variables in io bitfields	2015-12-24 00:47:20 -08:00
Jason Ekstrand	bbebd2de13	nir: Add a helper for getting the bitmask for a variable's location	2015-12-24 00:47:20 -08:00
Jason Ekstrand	4ff4310a78	nir/types: Expose glsl_type::count_attribute_slots()	2015-12-24 00:47:19 -08:00
Jason Ekstrand	0bc1b0fd23	nir/lower_return: Do it for real this time	2015-12-24 00:47:19 -08:00
Jason Ekstrand	e1b1d58bec	nir/cf: Make extracting or re-inserting nothing a no-op	2015-12-23 23:46:04 -08:00
Jason Ekstrand	eae352e75c	nir: Add a function for comparing cursors	2015-12-23 18:09:42 -08:00
Jason Ekstrand	54c870ff61	nir/spirv: Add support for undefs in vtn_ssa_value()	2015-12-23 14:14:39 -08:00
Jason Ekstrand	2e823d5754	nir/spirv: Properly handle vector times matrix	2015-12-23 13:49:56 -08:00
Jason Ekstrand	452ba4db2b	nir/spirv: Create the correct type if a matrix-vector multiply produces a vector	2015-12-23 13:49:56 -08:00
Jason Ekstrand	5b30132388	nir/spirv: Fix some mem_ctx issues with create_vec	2015-12-23 13:49:56 -08:00
Jason Ekstrand	66168a798b	nir/spirv: Better document vtn_ssa_value.transposed	2015-12-23 13:49:56 -08:00
Jason Ekstrand	3b391892aa	anv/descriptor_set: Use anv_foreach_stage	2015-12-23 13:49:56 -08:00
Jason Ekstrand	72ceb99bab	anv: Mask out invalid stages in foreach_stage	2015-12-23 13:49:56 -08:00
Jason Ekstrand	5644b1cece	nir/spirv: Handle LogicalNot	2015-12-23 13:49:56 -08:00
Jason Ekstrand	6219a69589	nir/spirv: Handle derefs in vtn_ssa_value This is kind of a hack, but it makes vtn_ssa_value insert a load if the value requested is actually a deref. This shouldn't happen normally but, thanks to the impedance mismatch of the NIR function parameter model vs. the SPIR-V model, this can happen for function arguments.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	3ab1b7afa8	nir/spirv: Do boolean fixup on block loads We used to do it for variable loads on things of type "uniform" but that never got ported to block loads.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	af74ce5a19	spirv/nir: Handle non-vector extractions in vtn_composite_extract	2015-12-23 13:49:56 -08:00
Jason Ekstrand	79b8b42081	nir/spirv: Handle function calls	2015-12-23 13:49:56 -08:00
Jason Ekstrand	95990c96cc	nir: Create the params array in function_impl_create	2015-12-23 13:49:56 -08:00
Jason Ekstrand	a7f3e113ad	i965/nir: Remove return handling This was added because we were getting spurious returns coming out of SPIR-V. Now that we're calling lower_returns, we don't need this.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	ac975b73cf	anv/pipeline: Run lower_returns and inline_functions after spirv_to_nir	2015-12-23 13:49:56 -08:00
Jason Ekstrand	8fba4bf79f	nir: Add a function inlining pass	2015-12-23 13:49:56 -08:00
Jason Ekstrand	b21db9cea5	nir/builder: Add a copy_deref_var helper	2015-12-23 13:49:56 -08:00
Jason Ekstrand	23cfa683d5	nir: move nir_copy_var from anv_nir_builder to nir_builder	2015-12-23 13:49:56 -08:00
Jason Ekstrand	4aac03fe61	nir/clone: Add support for cloning a single function_impl This will be useful for things such as function inlining.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	98291b8f2c	nir: Add a helper for creating a "bare" nir_function_impl This is useful if you want to clone a single function_impl if, for instance, you wanted to do function inlining.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	86772c2488	nir/control_flow: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	1749e667ea	nir: Add a stub function inlining pass All it does is remove the return at the end, but it's good enough for simple functions.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	413a9d3517	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype.	2015-12-23 13:49:56 -08:00
Kristian Høgsberg Kristensen	220ac9337b	vk: Only require wc bo mmap for !llc GPUs	2015-12-19 22:25:57 -08:00
Kristian Høgsberg Kristensen	b49aaf5de0	vk: Remove stale 48 bit addresses FIXMEs This has worked fine for a long time.	2015-12-19 22:20:45 -08:00
Kristian Høgsberg Kristensen	c4802bc44c	vk/gen8: Implement VkEvent for gen8 We use PIPE_CONTROL for setting and resetting the event from cmd buffers and MI_SEMAPHORE_WAIT in polling mode for waiting on an event.	2015-12-19 22:17:19 -08:00
Kristian Høgsberg Kristensen	8ac46d84ff	vk: Fix check for I915_PARAM_MMAP_VERSION Comparing the wrong thing for < 1.	2015-12-18 17:24:19 -08:00
Jordan Justen	5e82a91324	anv/gen8: Add support for gl_NumWorkGroups Co-authored-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-18 01:45:11 -08:00
Jason Ekstrand	d7f66f9f6f	nir/spirv: Array lengths are constants not literals	2015-12-17 16:36:29 -08:00
Jason Ekstrand	1473a8dc6f	anv/formats: Add more 64-bit formats	2015-12-17 13:51:09 -08:00
Jason Ekstrand	167809365b	anv/formats: Add more PACK32 formats	2015-12-17 13:44:50 -08:00
Jason Ekstrand	952bf05897	anv/image: Properly report buffer features	2015-12-17 11:52:31 -08:00
Jason Ekstrand	3395ca17d1	isl: Add an is_storage_image_format helper	2015-12-17 11:45:04 -08:00
Jason Ekstrand	b1325404c5	anv/device: Handle zero-sized memory allocations	2015-12-17 11:00:38 -08:00
Jason Ekstrand	c643e9cea8	anv/state: Allow levelCount to be 0 This can happen if the client is creating an image view of a textureable surface and they only ever intend to render to that view.	2015-12-16 17:34:57 -08:00
Jason Ekstrand	b2fe8b4673	nir/spirv: Add a missing break statement	2015-12-15 17:24:18 -08:00
Jason Ekstrand	1c51d91bfe	anv/pipeline: Allow the user to pass a null MultisampleCreateInfo According to section 5.2 of the Vulkan spec, this is allowed for color-only rendering pipelines.	2015-12-15 16:26:10 -08:00
Jason Ekstrand	d61ff1ed08	anv/descriptor_set: Initialize immutable_samplers to NULL Previously this wasn't a problem. However, with the new API update, descriptor sets can now be sparse so the client doesn't have to provide an entry for every binding. This means that it's possible for a binding to be uninitialized other than the memset. In that case, we want to have a null array of immutable samplers.	2015-12-15 16:24:22 -08:00
Jason Ekstrand	28c4ef9d6c	anv/device: Bump the size of the instruction block pool Some CTS test shaders were failing to compile. At some point soon, we really need to make a real pipeline cache and stop using a block pool for this.	2015-12-15 11:49:28 -08:00
Jason Ekstrand	306abbead3	anv/pipeline: Properly set IncludeVertexHandles in 3DSTATE_GS	2015-12-15 11:37:18 -08:00
Jason Ekstrand	2d4b7eda23	nir/spirv: Add support for more CS intrinsics	2015-12-15 10:20:23 -08:00
Jason Ekstrand	1035108a7f	nir/lower_system_values: Add support for computed builtins. In particular, this commit adds support for computing gl_GlobalInvocationID and gl_LocalInvocationIndex from other intrinsics.	2015-12-15 10:20:23 -08:00
Jason Ekstrand	630b9528b3	shader_enums: Add enums for gl_GlobalInvocationID and gl_LocalInvocationIndex	2015-12-15 10:20:23 -08:00
Jason Ekstrand	7ebd84fa4b	nir/lower_system_values: Refactor and use the builder. Now that we have a helper in the builder for system values and a helper in core NIR to get the intrinsic opcode, there's really no point in having things split out into a helper function. This commit "modernizes" this pass to use helpers better and look more like newer passes.	2015-12-15 10:20:23 -08:00
Jason Ekstrand	c26e889a44	nir/builder: Add a load_system_value helper While we're at it, go ahead and make nir_lower_clip use it. Cc: Rob Clark <robclark@gmail.com>	2015-12-15 10:20:23 -08:00
Jason Ekstrand	de67456d6d	nir/lower_system_values: Stop supporting non-SSA The one user of this (i965) only ever calls it while in SSA form.	2015-12-15 10:20:23 -08:00
Chad Versace	64f0ee73e0	isl: Add func isl_surf_get_image_offset_sa The function calculates the offset to a subimage within the surface, in units of surface samples. All unit tests pass with `make check`. (Admittedly, though, there are too few unit tests).	2015-12-15 08:46:09 -08:00
Chad Versace	53504b884e	isl: Fix calculation of array pitch for layout GEN4_2D The height of the miptree's right half was not large enough. Found by `make check` in test_isl_surf_get_offset, which is added in the next commit.	2015-12-15 08:46:09 -08:00
Chad Versace	f7e36f9f66	isl: Move it to a standalone directory The plan all along was to eventually move isl out of the Vulkan directory, because I intended i965 and anvil to share it. A small problem I encountered when attempting to write unit tests for isl precipitated the move. I discovered that it's easier to get isl unit tests to build if I remove the extra, unneeded dependencies injected by src/vulkan/Makefile.am. And the easiest way to remove those unneeded dependencies is to move isl out of src/vulkan. (Unit tests come in subsequent commits).	2015-12-15 08:45:49 -08:00
Jason Ekstrand	8224571ef8	vec4/generator: Actually pass the sampler into generate_tex This is an artifact of the way the separate samplers/textures series ended up getting sent out and rebased. This should fix a number of CTS tests involving geometry shaders.	2015-12-14 21:13:52 -08:00
Jordan Justen	7edcc59a7b	anv: Rename gs_vec4 to gs_kernel The code generated may be vec4 or simd8 depending on how we start the compiler. To run the GS in SIMD8, set the INTEL_SCALAR_GS environment variable. This was added in: commit `36fd653817` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Wed Mar 11 23:14:31 2015 -0700 i965: Add scalar geometry shader support. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 18:23:14 -08:00
Jordan Justen	a3c5c339a8	nir/spirv_to_nir: Use a minimum of 1 for GS invocations glslang is giving us 0, which causes the SIMD8 GS compile to hit an assert. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 18:23:14 -08:00
Jason Ekstrand	f46544dea1	anv: Fix CUBE storage images	2015-12-14 16:59:59 -08:00
Jason Ekstrand	783a21192c	anv: Add support for storage texel buffers	2015-12-14 16:51:12 -08:00
Jason Ekstrand	1f98bf8da0	anv: Pass an isl_format into fill_buffer_surface_state	2015-12-14 16:14:20 -08:00
Jason Ekstrand	091b6156dd	i965/fs: Push small uniform arrays Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs.	2015-12-14 15:58:10 -08:00
Jason Ekstrand	63c313de84	i965/fs: Rename demote_pull_constants to lower_constant_loads	2015-12-14 15:58:10 -08:00
Jason Ekstrand	75f33a6420	i965/vec4: Get rid of the uniform_size array	2015-12-14 15:58:09 -08:00
Jason Ekstrand	eb76f226cf	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD	2015-12-14 15:58:09 -08:00
Jason Ekstrand	a487f0284f	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	46f5396846	i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9c36c40845	i965/fs: Get rid of the param_size array	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9024353db3	i965/fs: Stop relying on param_size in assign_constant_locations Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9f46af9e41	i965/fs: Get rid of reladdr We aren't using it anymore.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	a3cd95a884	i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT.	2015-12-14 15:58:09 -08:00
Jordan Justen	c4219bc6ff	anv/cmd_buffer: Gen 8 requires 64 byte alignment for push constant data See MEDIA_CURBE_LOAD, CURBE Data Start Address & CURBE Total Data Length Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 15:39:07 -08:00
Jason Ekstrand	f0313a5569	anv: Add initial support for cube maps This fixes 486 cubemap CTS tests.	2015-12-14 15:36:30 -08:00
Jason Ekstrand	7ba70b1b51	nir: Add another index to load_uniform to specify the range read	2015-12-14 14:28:31 -08:00
Jason Ekstrand	4115648a6b	i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT	2015-12-14 14:28:31 -08:00
Jason Ekstrand	2f1455dbb0	i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it.	2015-12-14 14:28:31 -08:00
Jason Ekstrand	4be9a1c7bb	i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr The subnr field is in bytes so we don't need to multiply by type_sz.	2015-12-14 14:28:31 -08:00
Jason Ekstrand	653d8044ab	i965/fs: Don't force MASK_DISABLE on MOV_INDIRECT instructions It should work fine without it and the visitor can set it if it wants.	2015-12-14 14:28:30 -08:00
Jason Ekstrand	2c90f08bf7	i965/fs: Add support for doing MOV_INDIRECT on uniforms	2015-12-14 14:28:30 -08:00
Jason Ekstrand	dba28da075	anv/buffer_view: Store a bo + offset instead of buffer pointer This is what image_view does. Also, we really need to do this so that we can properly handle the combined offsets from the buffer and from pCreateInfo. This fixes some of the nonzero offset buffer view CTS tests.	2015-12-14 14:10:40 -08:00
Chad Versace	ee57062e1e	anv: Remove anv_image::surface_type When building RENDER_SURFACE_STATE, the driver set SurfaceType = anv_image::surface_type, which was calculated during anv_image_init(). This was bad because the value of anv_image::surface_type was taken from a gen-specific header, gen8_pack.h, even though the anv_image structure is used for all gens. Replace anv_image::surface_type with a gen-specific lookup function, anv_surftype(), defined in gen${x}_state.c. The lookup function contains some useful asserts that caught some nasty bugs in anv meta, which were fixed in the previous commit.	2015-12-14 10:46:27 -08:00
Chad Versace	f0d11d5a81	anv/meta: Fix VkImageViewType Meta unconditionally used VK_IMAGE_VIEW_TYPE_2D in anv_CmdCopyImage, anv_CmdBlitImage, anv_CmdCopyBufferToImage, and anv_CmdClearColorImage. This caused some out-of-bounds memory accesses. Fix it by adding a new function, anv_meta_get_view_type().	2015-12-14 09:03:58 -08:00
Chad Versace	0bebaeacd7	isl: Rename s/lod_align/image_align/ for consistency Regarding the subimages within a surface, sometimes isl called them "images" and sometimes "LODs". This patch makes isl consistently refer to them as "images". I chose the term "image" over "LOD" because LOD is a misnomer when applied to 3D surfaces. The alignment applies to each individual 2D subimage, not to the LOD as a whole. This patch changes no behavior. It's just a manually performed, case-insensitive replacement s/lod/image/ that maintains correct indentation.	2015-12-14 09:01:51 -08:00
Chad Versace	85a6384014	anv/tests: gitignore block_pool_no_free	2015-12-14 09:00:28 -08:00
Chad Versace	0da776b733	anv: Fix build for unit tests Clearly no one has been running `make check`, because the unit test build has been broken for a long time. After this build fix, all tests now pass.	2015-12-14 09:00:28 -08:00
Jason Ekstrand	c56186026f	anv: Add initial support for texel buffers	2015-12-12 16:11:23 -08:00
Jason Ekstrand	fd944197f2	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all.	2015-12-12 16:09:54 -08:00
Jason Ekstrand	1c605c8dfa	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in a shared local memory fix.	2015-12-11 14:29:13 -08:00
Jason Ekstrand	d12ea21dd5	gen8/pipeline: Support vec4 vertex shaders In order to actually get them, you need INTEL_DEBUG=vec4.	2015-12-11 13:25:17 -08:00
Kristian Høgsberg Kristensen	e803276148	Revert "i965/HACK: Build brw_cs into libcompiler" This reverts commit `6df7963531`.	2015-12-11 13:09:42 -08:00
Kristian Høgsberg Kristensen	21d5e52da8	Merge ../mesa into vulkan	2015-12-11 13:09:06 -08:00
Jason Ekstrand	6ae4e59fac	anv/pipeline: Get rid of the no kernel input parameters hack Previously, meta would pass null shaders in for the VS when it intended to disable the VS. However, this meant that we didn't know what inputs we had and would dead-code things in the FS. In order to solve this, we hard-coded a number. Now meta passes in a VS even if it plans to disable the stage so this is no longer needed.	2015-12-10 22:37:30 -08:00
Jason Ekstrand	bd0e25d41e	anv/apply_pipeline_layout: Multiply uniform sizes by 4 This is because uniforms are now in terms of bytes everywhere.	2015-12-10 22:36:49 -08:00
Jason Ekstrand	6df7963531	i965/HACK: Build brw_cs into libcompiler We need it for CS push constants	2015-12-10 22:36:07 -08:00
Jason Ekstrand	21cf55ab54	gen8/cmd_buffer: Don't push CS constants if there aren't any Issuing MEDIA_CURBE_LOAD with a size of zero causes GPU hangs on BDW.	2015-12-10 18:56:27 -08:00
Jason Ekstrand	3893e11f4b	anv: Use 4 instead of sizeof(gl_constant_value) We no longer have access to gl_constant_value and, really, it's 4 because our uniform layout code works entirely in dwords.	2015-12-10 18:55:16 -08:00
Jason Ekstrand	13d1dd465c	nir/spirv: Put SSBO store writemasks in the right index It moved with the nir_intrinsic_load/store update.	2015-12-10 18:54:44 -08:00
Jason Ekstrand	d5c9955d3e	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir_intrinsic_load/store changes and the switch of all uniforms in i965 to bytes. This accounts for the Vulkan changes.	2015-12-10 18:29:36 -08:00
Jason Ekstrand	8beea9d45b	anv/icd: Advertise the right ABI version	2015-12-10 12:27:13 -08:00
Chad Versace	5ba9121fe8	anv/image: Remove some vkCreateImage validation Don't validate the baseArrayLayer and layerCount of cube images. This allows us to remove a bloated lookup table and an unneeded struct definition (anv_image_view_info).	2015-12-09 16:33:23 -08:00
Chad Versace	9a9c551f3e	anv/image: Drop unused halign, valign lookup tables	2015-12-09 15:36:39 -08:00
Jason Ekstrand	46bcf9d777	vulkan: Pull in the 0.210.1 vk_platform header Somehow this got missed in the API update.	2015-12-09 11:55:38 -08:00
Jordan Justen	47e5fb52f4	gen8/compute: Setup push constants and local ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 11:04:30 -08:00
Jordan Justen	f8d5fb4293	anv: Add anv_cmd_buffer_cs_push_constants Similar to anv_cmd_buffer_push_constants, but handles the compute pipeline, which requires different setup from the other stages. This also handles initializing the compute shader local IDs. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 11:02:20 -08:00
Jordan Justen	974bdfa9ad	i965: Move brw_cs_fill_local_id_payload to brw_compiler.h Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-08 18:09:31 -08:00
Jordan Justen	d28df86c87	anv/compute: Fix thread width max off by 1 See corresponding code in: commit `8d87070af2` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Thu Aug 28 14:47:19 2014 -0700 i965/cs: Implement brw_emit_gpgpu_walker Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-08 18:09:31 -08:00
Chad Versace	db66424218	anv: Remove unused anv_image_view_info_for_vk_image_view_type()	2015-12-08 14:25:28 -08:00
Jason Ekstrand	f4aee5d82f	gen8/cmd_buffer: Flush push constants after descriptor sets This is because, if storage images are used, flushing descriptor sets can cause push constants to become dirty.	2015-12-07 21:45:43 -08:00
Jason Ekstrand	43ac954e25	anv: Add initial support for pushing image params The helper to fill out the image params data structure is still a dummy, but this puts the infrastructure in place.	2015-12-07 21:08:26 -08:00
Jason Ekstrand	1eb731d9fe	anv/descriptor_set: Add support for storage images in layouts	2015-12-07 21:08:26 -08:00
Jason Ekstrand	ff05f634f6	anv/image: Add a separate storage image surface state Thanks to hardware limitations, storage images may need a different surface format and/or other bits in the surface state.	2015-12-07 21:08:22 -08:00
Jason Ekstrand	8f83222d37	isl: Add initial support for storage images	2015-12-07 21:08:08 -08:00
Jason Ekstrand	42b4417031	HACK/i965: Disable assign_var_locations on uniforms This conflicts with the way we're doing uniforms in Vulkan.	2015-12-07 17:19:55 -08:00
Jason Ekstrand	cd75ff5d17	anv/pipeline: Only apply a pipeline layout if we have one	2015-12-07 16:56:02 -08:00
Chad Versace	9098e0f074	anv/image: Refactor anv_image_make_surface() Reduce the number of function parameters. Deduce the anv_image::*_surface from the parameters instead of requiring the caller to do that.	2015-12-07 09:28:14 -08:00
Chad Versace	3d85a28e90	anv: Assert the success of isl_surf_init()	2015-12-07 08:54:59 -08:00
Chad Versace	64e8af69b1	anv: Use isl_tiling_flags in anv_image_create_info Replace anv_image_create_info::force_tiling and anv_image_create_info::tiling with the bitmask anv_image_create_info::isl_tiling_flags. This allows us to drop the function anv_image.c:choose_isl_tiling_flags().	2015-12-07 08:50:28 -08:00
Chad Versace	c97d8af9aa	anv: Fix anv_gem_set_tiling to respect tiling param Function anv_gem_set_tiling() ignored its 'tiling' parameter. It unconditionally set the bo's tiling to I915_TILING_X.	2015-12-07 08:42:11 -08:00
Chad Versace	01e2932d6a	anv: Remove unused anv_format_s8_uint This is no longer needed after migrating to isl.	2015-12-07 08:40:14 -08:00
Kristian Høgsberg Kristensen	0a5bee1fe6	vk: Don't override and hardcode autoconf CFLAGS To disable optimizations pass CFLAGS="-O0 -g" on the configure command line.	2015-12-04 21:24:15 -08:00
Kristian Høgsberg Kristensen	7337870036	vk: Move isl files to libisl.la helper library These will be in their own library eventually - let's just do that now.	2015-12-04 21:24:15 -08:00
Chad Versace	2f270f0d15	anv/image: Fix choice of isl_surf_usage for depthstencil images Fixes assertion in vkCreateImage when VkFormat is combined depthstencil. Fixes many vulkancts tests that use combined depthstencil. For example, fixes dEQP-VK.pipeline.depth.format.d16_unorm_s8_uint.compare_ops.not_equal_less_or_equal_not_equal_greater.	2015-12-04 16:37:05 -08:00
Chad Versace	a09b4c298c	anv: Add func anv_get_isl_format()	2015-12-04 16:37:05 -08:00
Chad Versace	8b9ceda9f1	anv/image: Delete old ifdef'd out code	2015-12-04 16:37:05 -08:00
Jason Ekstrand	4dd5ef9e09	vk: Add needed builddir subdirectories to the include path This fixes out-of-tree builds and closes #1	2015-12-04 15:48:27 -08:00
Kristian Høgsberg Kristensen	f1f78a371e	vk: gem handles are uint32_t No functional difference, but let's be consistent with the kernel API.	2015-12-04 12:53:27 -08:00
Jason Ekstrand	8f722c2fa3	vk: Update the README for 0.210.1	2015-12-04 11:08:45 -08:00
Kristian Høgsberg	dac57750db	vk: Turn on Bay Trail, Cherryview and Broxton support	2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen	bbb6875f35	vk: Map uncached, coherent memory as write-combine This gives us the required characteristics for the memory type.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen	c3c61d210f	vk: Expose two memory types for non-LLC GPUs We're required to expose a host-visible, coherent memory type. On big core GPUs that share LLC, we can expose one such memory type that's also cached. However, on non-LLC GPUs we can't both be cached and coherent. Thus, we expose both the required coherent type and the cached but non-coherent combination.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	773592051b	vk: clflush all state for non-LLC GPUs	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	b431cf59a3	vk: Set I915_CACHING_NONE for userptr BOs when !llc Regular objects are created I915_CACHING_CACHED on LLC platforms and I915_CACHING_NONE on non-LLC platforms. However, userptr objects are always created as I915_CACHING_CACHED, which on non-LLC means snooped. That can be useful but comes with a bit of overhead. Since we're explicitly clflushing and don't want the overhead we need to turn it off.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	e0b5f0308c	vk: Implement vkFlushMappedMemoryRanges() We'll do a runtime switch on device->info.has_llc for now.	2015-12-04 09:51:47 -08:00
Jason Ekstrand	cb2382882e	nir/spirv: Update to SPIR-V version 1.0	2015-12-03 18:28:10 -08:00
Chad Versace	371fc2bc20	anv/gen9: Fix SURFACE_STATE halign and valign Pre-Skylake, RENDER_SURFACE_STATE.SurfaceVerticalAlignment is in units of surface samples. A surface sample is equivalent to a pixel in all surfaces except interleaved multisample surfaces. In Skylake, it is in units of surface elements. A surface element is equivalent to a surface sample except for compressed formats, in which case the element is a compression block.	2015-12-03 15:33:08 -08:00
Chad Versace	981ef2f02d	anv: Embed isl_surf into anv_surface This reduces struct anv_surface to just two members: an offset and the embedded isl_surf.	2015-12-03 15:31:00 -08:00
Chad Versace	594e673fcc	anv/image: Drop assertions on SURFTYPE extent limits In anv_image_create(), stop asserting that VkImageCreateInfo::extent does not exceed the hardware limits for the given SURFTYPE. The assertions were incorrect because they did not take into account the hardware gen. Anyways, these types of assertions belong in isl, not anvil.	2015-12-03 15:29:52 -08:00
Chad Versace	b369389640	anv/image: Use isl to calculate surface layout Remove the surface layout calculations in anv_image_make_surface(). Let isl_surf_init() do the heavy lifting. Fixes 8 Crucible tests and regresses none. (hw=Broadwell and crucible@33d91ec).	2015-12-03 15:29:08 -08:00
Chad Versace	afdadec77f	isl: Implement isl_surf_init() for gen4-gen9 This is a big code push. The patch is about 3000 lines. Function isl_surf_init() calculates the physical layout of a surface. The implementation is "complete" (but untested) for all 1D, 2D, 3D, and cube surfaces for gen4 through gen9, except: * gen9 1D surfaces * gen9 Ys multisampled surfaces * auxiliary surfaces (such as hiz, mcs, ccs)	2015-12-03 15:26:11 -08:00
Chad Versace	bda43a0f59	isl: Rename legacy Y tiling to ISL_TILING_Y0 Rename legacy Y tiling from ISL_TILING_Y to ISL_TILING_Y0 in order to clearly distinguish it from Yf and Ys. Using ISL_TILING_Y to denote legacy Y tiling would lead to confusion with i965, because i965 uses I915_TILING_Y to denote any Y tiling.	2015-12-03 15:26:11 -08:00
Chad Versace	57941b61ab	anv/image: Vulkan's depthPitch is in bytes, not rows Fix for VkGetImageSubresourceLayout.	2015-12-03 15:26:11 -08:00
Jason Ekstrand	bfeaf67391	anv/device: Give a version of 0.210.1 in apiVersion	2015-12-03 15:23:33 -08:00
Jason Ekstrand	d666487dc6	vk: Add new WSI support and bump the API to 0.210.1	2015-12-03 15:15:29 -08:00
Jason Ekstrand	fde60c1684	anv/entrypoints: Run the headers through the preprocessor first This allows us to filter based on preprocessor directives. We could build a partial preprocessor into the generator, but we would likely get it wrong. This allows us to filter out, for instance, windows-specific WSI stuff.	2015-12-03 14:13:55 -08:00
Jason Ekstrand	4c19243562	vk/0.210.0: Advertise version 0.210.0	2015-12-03 13:44:02 -08:00
Jason Ekstrand	888744cabf	vk/0.210.0: Update queries to the new API	2015-12-03 13:44:02 -08:00
Jason Ekstrand	924fbfc9a1	vk/0.210.0: Fix how we handle access flags in barriers The initial implementation in the 0.210.0 API update was misguided as to what the access flags meant. This should be more correct.	2015-12-03 13:44:02 -08:00
Jason Ekstrand	fa2435de3c	vk/0.210.0: Update the VkFormat enum	2015-12-03 13:44:02 -08:00
Jason Ekstrand	4e904a0310	vk/0.210.0: Rework vkQueueSubmit	2015-12-03 13:44:02 -08:00
Jason Ekstrand	5757ad2959	vk/0.210.0: Remove depth clip and add depth clamp	2015-12-03 13:43:59 -08:00
Jason Ekstrand	d689745303	vk/0.210.0: Rework device features and limits	2015-12-03 13:43:54 -08:00
Jason Ekstrand	74c4c4acb6	vk/0.210.0: Rework QueueFamilyProperties	2015-12-03 13:43:54 -08:00
Jason Ekstrand	fed3586f34	vk/0.210.0: Rework result and structure type enums By and large, this is just moving enum values around. However, it also removed VK_UNSUPPORTED which we were returning in a number of places. Those places now return VK_ERROR_INCOMPATIBLE_DRIVER.	2015-12-03 13:43:54 -08:00
Jason Ekstrand	a5f19f64c3	vk/0.210.0: Remove the VkShaderStage enum This made for an unfortunately large amount of work since we were using it fairly heavily internally. However, gl_shader_stage does basically the same things, so it's not too bad.	2015-12-03 13:43:54 -08:00
Jason Ekstrand	e10dc002e9	vk/0.210.0: Remove VkShader	2015-12-03 13:43:54 -08:00
Jason Ekstrand	e6ab06ae7f	vk/0.210.0: Rework memory property flags	2015-12-03 13:43:54 -08:00
Jason Ekstrand	93071482f9	vk/0.210.0: Remove some unused enum values	2015-12-03 13:43:54 -08:00
Jason Ekstrand	b264012fcf	vk/0.210.0: Update VkPipelineStageFlagBits	2015-12-03 13:43:54 -08:00
Jason Ekstrand	1aaf15bf19	vk/0.210.0: Trivial function argument name change	2015-12-03 13:43:53 -08:00
Jason Ekstrand	938a2939c8	vk/0.210.0: We now allocate command buffers; not create them	2015-12-03 13:43:53 -08:00
Jason Ekstrand	5a02441789	vk/0.210.0: Rename a parameter to GetImageSparseMemoryRequirements	2015-12-03 13:43:53 -08:00
Jason Ekstrand	a9fc0ce0e3	vk/0.210.0: Delete three no longer existent entrypoints	2015-12-03 13:43:53 -08:00
Jason Ekstrand	fcfb404a58	vk/0.210.0: Rework allocation to use the new pAllocators	2015-12-03 13:43:53 -08:00
Jason Ekstrand	d3547e7334	vk/0.210.0: Use VkSampleCountFlagBits for sample counts	2015-12-03 13:43:53 -08:00
Jason Ekstrand	9349625d60	vk/0.210.0: Rework VkInstanceCreateInfo	2015-12-03 13:43:53 -08:00
Jason Ekstrand	c30a021820	vk/0.210.0: More function argument renaming	2015-12-03 13:43:53 -08:00
Jason Ekstrand	b1cd025b88	vk/0.210.0: Replace MemoryInput/OutputFlags with AccessFlags	2015-12-03 13:43:53 -08:00
Jason Ekstrand	43f3e92348	vk/0.210.0: Rework render pass description structures	2015-12-03 13:43:53 -08:00
Jason Ekstrand	299f8f1511	vk/0.210.0: More structure field renaming	2015-12-03 13:43:53 -08:00
Jason Ekstrand	407b8cc5e0	vk/0.210.0: Get rid of VkImageAspect	2015-12-03 13:43:53 -08:00
Jason Ekstrand	3f6abd0161	vk/0.210.0: Rework descriptor sets	2015-12-03 13:43:52 -08:00
Jason Ekstrand	6a6da54ccb	vk/0.210.0: Rename parameters to memory binding/mapping functions	2015-12-03 13:43:52 -08:00
Jason Ekstrand	aadb7dce9b	vk/0.210.0: Update to the new instance/device create structs	2015-12-03 13:43:52 -08:00
Jason Ekstrand	607fe31598	vk/0.210.0: More trivial struct/enum changes	2015-12-03 13:43:52 -08:00
Jason Ekstrand	dde7172a8a	vk/0.210.0: Trivial flag enum updates	2015-12-03 13:43:52 -08:00
Jason Ekstrand	4cf0b57bbf	vk/0.210.0: Rename ChannelFlags to ColorComponentFlags	2015-12-03 13:43:52 -08:00
Jason Ekstrand	7f2284063d	vk/0.210.0: s/raster/rasterization/	2015-12-03 13:43:52 -08:00
Jason Ekstrand	1ab9f843bc	vk/0.210.0: Don't allow chaining of description structs	2015-12-03 13:43:52 -08:00
Jason Ekstrand	17486b8664	vk/0.210.0: More fun with flags fields	2015-12-03 13:43:52 -08:00
Jason Ekstrand	f5ba1f994a	vk/0.210.0: Make pCode a uint32_t pointer	2015-12-03 13:43:52 -08:00
Jason Ekstrand	5f348bd0e5	vk/0.210.0: Rename origin fields of VkViewport	2015-12-03 13:43:52 -08:00
Jason Ekstrand	9fa6e328eb	vk/0.210.0: Move alphaToOne and alphaToCoverage to multisample state	2015-12-03 13:43:52 -08:00
Jason Ekstrand	f97c3b6d58	vk/0.210.0: Add flags fields to various pipeline create structs	2015-12-03 13:43:51 -08:00
Jason Ekstrand	e673d64209	vk/0.210.0: Change field names in vertex input structs	2015-12-03 13:43:51 -08:00
Jason Ekstrand	fd53603e42	vk/0.210.0: Misc. no-op structure changes The only non-trivial change is to sparse resources that we don't handle anyway.	2015-12-03 13:43:51 -08:00
Jason Ekstrand	fe644721aa	vk/0.210.0: Rename property pCount parameters	2015-12-03 13:43:51 -08:00
Jason Ekstrand	e8f2294cd2	vk/0.210.0: Rework sampler filtering and mode enums	2015-12-03 13:43:51 -08:00
Jason Ekstrand	2e10ca5748	vk/0.210.0: Misc. function argument renames	2015-12-03 13:43:51 -08:00
Jason Ekstrand	569f70be56	vk/0.210.0: Rework copy/clear/blit API	2015-12-03 13:43:47 -08:00
Jason Ekstrand	4ab9391fbb	vk/0.210.0: Rework dynamic states	2015-11-30 14:19:41 -08:00
Jason Ekstrand	73ef7d47d2	vk/0.210.0: Rework color blending enums	2015-11-30 13:49:28 -08:00
Jason Ekstrand	2c77b0cd01	gen7/8/cmd_buffer: Inline vk_to_gen_swizzle It's currently unused on IVB so we get compiler warnings.	2015-11-30 13:29:51 -08:00
Jason Ekstrand	9b1cb8fdbc	vk/0.210.0: Rework a few raster/input enums	2015-11-30 13:28:17 -08:00
Jason Ekstrand	a53f23d93f	vk/0.210.0: Rework texture view component mapping	2015-11-30 13:06:12 -08:00
Jason Ekstrand	f1a7c7841f	vk/0.210.0: Switch to the new VKAPI function decorations While we're at it, we do a bunch of the VkResult -> void updates	2015-11-30 12:46:30 -08:00
Jason Ekstrand	a89a485e79	vk/0.210.0: Rename CmdBuffer to CommandBuffer	2015-11-30 11:48:08 -08:00
Jason Ekstrand	6a8a542610	vk/0.210.0: A pile of minor enum updates	2015-11-30 11:12:44 -08:00
Jason Ekstrand	3db43e8f3e	vk/0.210.0: Switch to the new-style handle declarations	2015-11-30 10:58:02 -08:00
Jason Ekstrand	5cb57806b2	vk: Add canonical 0.170.2 and 0.210.0 headers This is in preparation for the API update	2015-11-30 10:24:35 -08:00
Kristian Høgsberg Kristensen	d6d82f1ab3	vk: Fix 3DSTATE_WM_DEPTH_STENCIL for gen8 This packet is a different size on gen8 and we hit an assertion when we try to merge a gen9 size dword array from the pipeline with the gen8 sized array we create from dynamic state. Use a static assert in the merge macro and fix this issue by using different wm_depth_stencil arrays on gen8 and gen9.	2015-11-26 10:11:52 -08:00
Kristian Høgsberg Kristensen	cd4721c062	vk: Add SKL support Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-11-25 22:34:10 -08:00
Kristian Høgsberg Kristensen	c445fa2f77	vk: Make entrypoint generator output gen9 entry points Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-11-25 20:58:25 -08:00
Kristian Høgsberg Kristensen	0e02a88ad4	vk: Add GEN9 pack header	2015-11-25 20:56:41 -08:00
Kristian Høgsberg Kristensen	0c59cb42b5	vk: Move all gen8 files to gen8 lib	2015-11-25 14:13:53 -08:00
Jason Ekstrand	179fc4aae8	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir cloning and some much-needed upstream refactors.	2015-11-23 14:03:47 -08:00
Jason Ekstrand	e14b2c76b4	anv/meta_clear: Don't trash state if no clears are needed	2015-11-21 11:39:12 -08:00
Jason Ekstrand	83c305f8ef	anv/meta_clear: Don't try to clear depth-stencil without LOAD_OP_CLEAR	2015-11-21 00:05:18 -08:00
Jason Ekstrand	438eaa3ae7	anv/meta: Add initial support for multi-slice array and 3-D copies We still need to fix up a few bits once we have real CPP values, but this should get us a long ways.	2015-11-20 18:25:06 -08:00
Jason Ekstrand	d6a7c659c7	anv/meta: Use array textures for 2D This is a total of 1 extra instruction in the shader and gives us a lot more flexibility in how we do blits.	2015-11-20 16:00:34 -08:00
Jason Ekstrand	e3ec964e44	anv/meta: Keep z coordinate flat while blitting	2015-11-20 15:48:03 -08:00
Jason Ekstrand	1157b0360d	nir/spirv: Rework decoration iteration The old code didn't work correctly if you had member decorations after non-member decorations. Since glslang never gave us any of those, it wasn't properly tested.	2015-11-20 15:15:40 -08:00
Jason Ekstrand	cff74d6fb8	nir/spirv: Handle OpNop	2015-11-20 15:02:45 -08:00
Jason Ekstrand	1d42f773d3	gen8_state: Clamp sampler values to HW limitations	2015-11-20 14:45:44 -08:00
Jason Ekstrand	48228c114e	nir/spirv: Add support for runtime arrays	2015-11-20 12:49:20 -08:00
Jason Ekstrand	55d16c090e	gen8/pipeline: Properly handle MIN/MAX blend ops	2015-11-20 11:53:10 -08:00
Jason Ekstrand	b43ce6768d	gen8/pipeline: Set IndependentAlphaBlendEnable properly	2015-11-20 11:52:54 -08:00
Jason Ekstrand	e69db9159b	gen8/pipeline: Minor blending fixes This makes various fields match upstream mesa	2015-11-20 11:52:30 -08:00
Jason Ekstrand	fa8db0dfcc	anv: Put all of the descriptor set stuff together in one file The stuff to take descriptor sets and turn them into binding tables and sampler tables is still in anv_cmd_buffer.c. We may want to consider putting it in anv_descriptor_set.c eventually.	2015-11-18 14:58:43 -08:00
Jason Ekstrand	828b1a6eb6	anv/device: Update the right sampler in UpdateDescriptorSets	2015-11-18 14:48:28 -08:00
Jason Ekstrand	6f613abc2b	anv/cmd_buffer: Add a new genX_cmd_buffer file for shared code This file contains code that can be shared across gens modulo recompiling. In particular, we can share STATE_BASE_ADDRESS setup and handling of the vkPipelineBarrier call. Not sharing STATE_BASE_ADDRESS setup has already been a source of bugs and the gen7 and gen8 implementations of PipelineBarrier were line-for-line identical. Incidentally, this should fix MOCS settings for dynamic and surface state on Haswell.	2015-11-18 12:26:57 -08:00
Jason Ekstrand	fb8b2f5f9e	anv/gen7: A bunch of depth-stencil fixes There are various bits which move around between Haswell and Ivy Bridge that we weren't taking into account. This also makes us actually set the StencilWriteEnable in a sane way.	2015-11-18 11:43:52 -08:00
Jason Ekstrand	e9d634f4ad	gen7/pipeline: Re-arrange stencil parameters to match gen8	2015-11-17 19:10:31 -08:00
Jason Ekstrand	9e39bdabad	anv/gen7: Implement CmdPipelineBarrier	2015-11-17 17:09:27 -08:00
Jason Ekstrand	b707e90b6e	anv/gen7: Don't use the upper bound on dynamic state base address It doesn't do much for us and, if we have to resize the dynamic state block pool for any reason, it becomes out-of-date.	2015-11-17 17:08:44 -08:00
Jason Ekstrand	f0390bcad6	anv: Add initial Haswell support	2015-11-17 12:14:24 -08:00
Jason Ekstrand	45320f677b	anv: Add macros for doing per-gen compilation	2015-11-17 08:27:51 -08:00
Jason Ekstrand	92d164b1c3	anv/entrypoints: Add dispatch support for haswell	2015-11-17 08:27:51 -08:00
Jason Ekstrand	aa3002bd42	anv/entrypoints: Use devinfo instead of a gen number	2015-11-17 08:27:51 -08:00
Jason Ekstrand	0508046dc8	anv/cmd_buffer: Pack the 3DSTATE_VF packet on-demand	2015-11-17 08:27:51 -08:00
Jason Ekstrand	34d55d69cf	anv/formats: Don't advertise stencil texture/blit prior to Broadwell	2015-11-17 08:23:29 -08:00
Jason Ekstrand	de54b4b18f	anv: Only include the pack headers where needed Previously, we were including gen7_pack.h, gen75_pack.h, and gen8_pack.h in anv_private.h. As we add more gens, this is going to become untenable. This commit moves things around so that we only use the pack headers when and if we need them.	2015-11-16 12:29:09 -08:00
Jason Ekstrand	cb9e2305f8	anv/cmd_buffer: Move gen-specific stuff into the appropriate files	2015-11-16 12:10:11 -08:00
Jason Ekstrand	22d024e031	nir/spirv: Add support for separate samplers and textures This gets tricky in a few places because we have to pass vtn_sampled_image values through OpAccessChain, but it works ok. At some point, it probably needs to be cleaned up but it doesn't occur to me exactly how to do that at the moment. We'll see how this approach goes.	2015-11-14 22:32:54 -08:00
Jason Ekstrand	002db3ee15	anv/cmd_buffer: Add a default descriptor type case This silences a bunch of compiler warnings.	2015-11-14 09:16:55 -08:00
Jason Ekstrand	e9dba80430	anv/apply_pipeline_layout: Handle separate samplers and textures	2015-11-14 09:00:35 -08:00
Jason Ekstrand	b5d4027c35	Merge branch 'wip/i965-separate-sampler-tex' into vulkan	2015-11-14 08:23:27 -08:00
Jason Ekstrand	c7d504ad93	i965/vec4: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:05:31 -08:00
Jason Ekstrand	3dd84822df	i965/vec4: Separate the sampler from the surface in generate_tex	2015-11-14 08:05:31 -08:00
Jason Ekstrand	c09e140b65	i965/fs: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:04:47 -08:00
Jason Ekstrand	c2a373ec85	i965/fs: Separate the sampler from the surface in generate_tex	2015-11-14 08:01:50 -08:00
Jason Ekstrand	b169bb902a	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the sampler and leaves the texture alone as it did before and nir_lower_samplers assumes this. However, backends can, if they wish, assume that they are separate because nir_lower_samplers sets both texture and sampler index (they are the same in this case).	2015-11-14 07:57:31 -08:00
Jason Ekstrand	1469ccb746	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in Matt's big compiler refactor.	2015-11-14 07:56:10 -08:00
Jason Ekstrand	e8f51fe4de	anv/gen8: Subtract 1 from num_elements when setting up buffer surface state	2015-11-13 22:50:54 -08:00
Jason Ekstrand	91bc4e7cec	anv/pipeline: Don't free blend states that don't exist Compute pipelines don't need a blend state so we shouldn't be unconditionally freeing it.	2015-11-13 21:49:41 -08:00
Jason Ekstrand	c1733886a6	nir/spirv: Add support for SSBO stores This only handles vector stores, not component-of-a-vector stores.	2015-11-13 21:41:52 -08:00
Jason Ekstrand	c68e28d766	nir/spirv: Refactor vtn_block_load We pull the offset calculations out into their own function so we can re-use it for stores.	2015-11-13 21:32:00 -08:00
Jason Ekstrand	99494b96f0	nir/spirv: Add support for image_load_store	2015-11-13 17:54:43 -08:00
Jason Ekstrand	164b3ca164	nir/builder: Add a nir_ssa_undef helper	2015-11-13 17:54:43 -08:00
Jason Ekstrand	ffbc31d13b	nir/spirv: Add support for creating image variables	2015-11-13 17:54:43 -08:00
Jason Ekstrand	453239f6a5	nir/spirv: Add support for image types	2015-11-13 17:54:43 -08:00
Jason Ekstrand	0572444a0e	nir/types: Add image type helpers	2015-11-13 17:54:43 -08:00
Jason Ekstrand	d5ba7a26d9	glsl/types: Add a get_image_instance helper	2015-11-13 17:54:43 -08:00
Chad Versace	738eaa8acf	isl: Embed brw_device_info in isl_device Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 11:14:03 -08:00
Chad Versace	ba467467f4	anv: Use enum isl_tiling everywhere In anv_surface and anv_image_create_info, replace member 'uint8_t tile_mode' with 'enum isl_tiling'. As a nice side-effect, this patch also reduces bug potential because the hardware enum values for tile modes are unstable across hardware generations.	2015-11-13 10:44:09 -08:00
Chad Versace	af392916ff	anv/device: Embed isl_device Embed struct isl_device into anv_physical_device and anv_device. It will later be used for surface layout calculations.	2015-11-13 10:44:09 -08:00
Chad Versace	a4a2ea3f79	isl: Add enum isl_tiling and a query func The query func is isl_tiling_get_extent.	2015-11-13 10:44:07 -08:00
Chad Versace	652727b029	isl: Add structs isl_extent2d, isl_extent3d They are nowhere used yet.	2015-11-13 10:31:49 -08:00
Chad Versace	b1bb270590	isl: Add struct isl_device The struct is incomplete (it contains only the gen). And it's nowhere used yet. It will be used later for surface layout calculations.	2015-11-13 10:31:37 -08:00
Chad Versace	477383e9ac	anv: Strip trailing whitespace from anv_device.c	2015-11-13 10:27:40 -08:00
Chad Versace	c6493dff79	anv: Strip trailing space in anv_private.h	2015-11-12 12:24:01 -08:00
Chad Versace	addc2a9d02	anv: Remove redundant fields anv_format::bs,bw,bh,bd Instead, use the equivalent fields in anv_format::isl_layout.	2015-11-12 12:23:49 -08:00
Chad Versace	cbc31f453d	anv/formats: Re-indent the fmt() macro Use one line per struct member.	2015-11-12 12:21:46 -08:00
Chad Versace	1bea1669c5	anv: Use enum isl_format in anv_format This patch begins using isl.h in Anvil. More refactors will follow. Change type of anv_format::surface_format from uint16_t -> enum isl_format.	2015-11-12 12:21:46 -08:00
Chad Versace	bfb022a235	isl: Generate isl_format_layout.c Generate an array of struct isl_format_layout, using isl_format_layout.csv as input. Each entry follows the pattern: [ISL_FORMAT_R32G32B32A32_FLOAT] = { ISL_FORMAT_R32G32B32A32_FLOAT, .bs = 16, .bpb = 128, .bw = 1, .bh = 1, .bd = 1, .channels = { .r = { ISL_SFLOAT, 32 }, .g = { ISL_SFLOAT, 32 }, .b = { ISL_SFLOAT, 32 }, .a = { ISL_SFLOAT, 32 }, .l = {}, .i = {}, .p = {}, }, .colorspace = ISL_COLORSPACE_LINEAR, .txc = ISL_TXC_NONE, },	2015-11-12 12:21:46 -08:00
Chad Versace	7986efc644	isl: Add CSV of format layouts Add file isl_format_layout.csv, which describes the block layout, channel layout, and colorspace of all hardware surface formats.	2015-11-12 11:56:16 -08:00
Chad Versace	67362698a9	isl: Add enum isl_format	2015-11-12 11:34:45 -08:00
Jason Ekstrand	3a3d79b38e	anv/gen7: Implement the VS state depth-stall workaround	2015-11-10 16:42:34 -08:00
Jason Ekstrand	750b8f9e98	anv/gen7: Properly handle a GS with zero invocations	2015-11-10 16:41:23 -08:00
Jason Ekstrand	9d18555c8d	anv/gen7: Add push constant support	2015-11-10 15:14:11 -08:00
Jason Ekstrand	427978d933	anv/device: Use an actual int64_t in WaitForFences	2015-11-10 15:02:52 -08:00
Jason Ekstrand	d9079648d0	anv/meta: Create a sampler in meta_emit_blit	2015-11-10 14:43:18 -08:00
Jason Ekstrand	b461744c52	anv/gen7: Properly handle VS with VertexID but no vertices	2015-11-10 11:31:31 -08:00
Jason Ekstrand	aafc87402d	anv/device: Work around the i915 kernel driver timeout bug There is a bug in some versions of the i915 kernel driver where it will return immediately if the timeout is negative (it's supposed to wait indefinitely). We've worked around this in mesa for a few months but never implemented the work-around in the Vulkan driver. I rediscovered this bug again while working on Ivy Bridge because the drive in my Ivy Bridge currently has Fedora 21 installed which has one of the offending kernels.	2015-11-10 11:24:11 -08:00
Jason Ekstrand	06f466a770	anv/nir: Fix codegen in lower_push_constants	2015-11-09 16:29:05 -08:00
Jason Ekstrand	abede04314	anv/gen7: Fix the length of 3DSTATE_SF	2015-11-09 16:04:07 -08:00
Jason Ekstrand	e8c2a52a70	anv/gen7: Properly handle missing color-blend state	2015-11-09 16:04:06 -08:00
Jason Ekstrand	862da6a891	anv/device: Add a newline to the end of a comment	2015-11-09 16:04:06 -08:00
Nanley Chery	9c2b37a9c3	anv/formats: Define ETC2 formats Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	41cf35d1d8	anv/image: Determine the alignment units for compressed formats Alignment units, i and j, match the compressed format block width and height respectively. v2: Don't assert against HALIGN* and VALIGN* enums (Chad) Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	381f602c6b	anv/image: Handle compressed format qpitch and padding Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	300f7c2be3	anv/image: Handle compressed format stride and size These formulas did not take compressed formats into account. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	7b4244dea0	anv/formats: Add fields for block dimensions A non-compressed texture is a 1x1x1 block. Compressed textures could have values which vary in different dimensions WxHxD. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	a6c7d1e016	anv/formats: Add surface_format initializer v2: Rename __brw_fmt to __hw_fmt (Chad) Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	3ee923f1c2	anv: Rename cpp variable to "bs" cpp (chars-per-pixel) is an integer that fails to give useful data about most compressed formats. Instead, rename it to "bs" which stands for block size (in bytes). v2: Rename vk_format_for_bs to vk_format_for_size (Chad) Use "block size" instead of "bs" in error message (Chad) Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Jason Ekstrand	17fa3d3572	nir/spirv: Give both block and buffer_block types an interface type	2015-11-07 08:03:25 -08:00
Jason Ekstrand	a10d59c09a	nir/spirv: Increment num_ubos/ssbos when creating variables	2015-11-06 16:53:27 -08:00
Jason Ekstrand	046563167c	anv/apply_dynamic_offsets: Use the right sized immediate zero	2015-11-06 16:49:24 -08:00
Jason Ekstrand	104525c33b	anv/pipeline: Set the right SSBO binding table start index for FS	2015-11-06 15:57:51 -08:00
Jason Ekstrand	399d5314f6	anv/cmd_buffer: Rework the way we emit UBO surface state The new mechanism should be able to handle SSBOs as well as properly handle emitting surface state on gen7 where we need different strides depending on shader stage.	2015-11-06 15:14:12 -08:00
Jason Ekstrand	1b5c7e7ecd	anv/pipeline: Expose is_scalar_shader_stage	2015-11-06 15:12:33 -08:00
Jason Ekstrand	5ba281e794	nir/spirv: Add a helper for determining if a block is externally visible	2015-11-06 15:09:57 -08:00
Jason Ekstrand	220261a0c9	anv: Use VkDescriptorType instead of anv_descriptor_type	2015-11-06 14:09:52 -08:00
Jason Ekstrand	612e35b2c6	anv: Do range-checking in the shader for dynamic buffers	2015-11-06 13:32:52 -08:00
Jason Ekstrand	f8052351ac	anv/device: Increase the block size for instructions	2015-11-06 13:29:47 -08:00
Jason Ekstrand	d7cc9929bb	anv: Remove all support for BufferViews We never actually supported them, we just used them for binding UBOs. Now that we have BufferInfo and we aren't supporting texture buffers yet, we should get rid of them until we can do them properly.	2015-11-06 13:16:18 -08:00
Jason Ekstrand	0360c3608b	anv/device: Only support binding UBOs through BufferInfo	2015-11-06 12:52:12 -08:00
Jason Ekstrand	3aa2fc82dd	anv: Rework UpdateDescriptorSets Previously, UpdateDescriptorSets was wrong because it assumed that the binding was the offset into the descriptor set.	2015-11-06 12:28:03 -08:00
Jason Ekstrand	45b1bbe801	anv: Add a descriptor_index to anv_descriptor_set_binding_layout	2015-11-06 12:16:54 -08:00
Jason Ekstrand	f029e0ce13	anv: Add a layout to anv_descriptor_set	2015-11-06 12:16:54 -08:00
Chad Versace	16119ad884	anv/meta: Finish load clears for stencil attachments Tested by Crucible "func.depthstencil.stencil_triangles.*" in commit c194292d5eadb84e9d7489fc01ce0b653cdd4ca5 (HEAD -> master) Author: Chad Versace <chad.versace@intel.com> Date: Wed Nov 4 16:19:24 2015 -0800 Subject: func.depthstencil: Remove stencil clear workaround for Mesa	2015-11-05 15:45:43 -08:00
Jason Ekstrand	a40f682c71	anv/cmd_buffer: Fix SURFACE_STATE for non-view buffer bindings We were treating it as if it's a BufferView and weren't taking the offset into account properly.	2015-11-04 19:56:18 -08:00
Jason Ekstrand	1b68120760	anv/cmd_buffer: Don't use an anv_state pointer in emit_binding_table The anv_state is supposed to be a flyweight so we're not really saving anything by using a pointer. Also, we were creating one, setting a pointer to it, and then having it go out-of-scope which is bad.	2015-11-04 19:56:16 -08:00
Chad Versace	d259af3fbb	anv: Remove unused anv_render_pass members Remove members num_color_clear_attachments has_depth_clear_attachment has_stencil_clear_attachment The new clear code in anv_meta_clear.c does not use them.	2015-11-04 15:54:38 -08:00
Chad Versace	a9a3071fc4	anv/meta: Rewrite clear code Fixes Crucible test "func.clear.load-clear.attachments-8". The old clear code, when clearing attachments for VK_ATTACHMENT_LOAD_OP_CLEAR, suffered from some fundamental bugs. The bugs were not fixable with the old code's approach. - It assumed that a VkRenderPass contained at most one depthstencil attachment. - It tried to clear all attachments (color and the sole depthstencil) with a single instanced draw call, using the VUE header's RenderTargetArrayIndex to specify the instance's target color attachment. But the RenderTargetArrayIndex does not select entries in the binding table; it only selects an array index of a single layered surface. - If at least one attachment of VkRenderPass had VK_ATTACHMENT_LOAD_OP_CLEAR, then the old code cleared all attachments. This was a consequence of using a single draw call and single pipeline for the clear. The new clear code fixes those bugs by making a separate draw call for each attachment, and using one pipeline when clearing color attachments and a different pipeline for depth attachments. The new code, like the old code, does not clear stencil attachments. It is left as a FINISHME.	2015-11-04 15:20:52 -08:00
Chad Versace	49c96a14c5	anv/meta: Clear color attribute is always flat No behavioral change. This patch just removes an unneeded function parameter.	2015-11-04 15:15:19 -08:00
Chad Versace	7f82cc718f	anv/meta: Use consistent naming for dynamic state mask Consistently rename bitmasks of Vulkan dynamic state to 'dynamic_mask'. anv_meta_saved_state::dynamic_flags -> dynamic_mask anv_meta_save(dynamic_state) -> dynamic_mask	2015-11-04 15:15:19 -08:00
Chad Versace	2bdb9e2ed9	anv/meta: Rename anv_cmd_buffer_save/restore As the functions are now exposed in anv_meta.h, let's rename them to clarify that they are meta functions. anv_cmd_buffer_save -> anv_meta_save anv_cmd_buffer_restore -> anv_meta_restore	2015-11-04 15:15:19 -08:00
Chad Versace	16b2a489db	anv: Move meta clear code to new file anv_meta_clear.c anv_meta.c currently handles blits, copies, clears, and resolves. The clear code is about to grow, and anv_meta.c is already bursting at the seams.	2015-11-04 15:15:19 -08:00
Chad Versace	c56727037a	anv: Move struct anv_vue_header to anv_private.h Move it from anv_meta.c to the common header anv_private.h. This allows us to split the meta blit and meta clear code into separate files.	2015-11-04 15:15:19 -08:00
Jason Ekstrand	b00e3f221b	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-11-03 15:45:04 -08:00
Jason Ekstrand	a1e7b8701a	nir: remove sampler_set from nir_tex_instr Now that descriptor sets are handled in a lowering pass, this is no longer needed.	2015-11-03 14:58:20 -08:00
Chad Versace	4d1c76485b	anv: Drop stale comment in anv_cmd_buffer_emit_binding_table() When emitting the binding table for the fragment shader stage, we no longer "walk all of the attachments, [inserting only] the color attachments into the binding table". Instead, we iterate only over the subpass's color attachments, which is the minimal possible iteration. While killing the comment, also rename the variable 'attachments' to 'color_count', as it's no longer a count of all framebuffer attachments but only the subpass's color attachment count.	2015-11-03 13:46:40 -08:00
Jason Ekstrand	584f9d4442	anv: Report 0 physical devices when not on Broadwell or Ivy Bridge Right now, Broadwell and Ivy Bridge are the only supported platforms. Hopefully, this reduces the chances that someone will try the driver on unsupported hardware and be confused that it doesn't work.	2015-11-02 12:14:37 -08:00
Jason Ekstrand	3883728730	anv: Add better push constant support What we had before was kind of a hack where we made certain untrue assumptions about the incoming data. This new support, while it still doesn't support indirects properly (that will come), at least pulls the offsets and strides from SPIR-V like it's supposed to.	2015-10-29 22:26:36 -07:00
Jason Ekstrand	1f2624e6dd	nir/spirv: Add support for push constants	2015-10-29 22:26:00 -07:00
Jason Ekstrand	a2283508b0	nir/intrinsics: Add a load_push_constant intrinsic	2015-10-29 22:26:00 -07:00
Jason Ekstrand	f2a8c9db24	nir/spirv: Rework the way we handle interface types	2015-10-29 22:26:00 -07:00
Chad Versace	4073219cf1	anv/pass: Remove redundant assert Trivial fix.	2015-10-29 11:47:39 -07:00
Chad Versace	1e98177439	anv/pass: Move VkRenderPass code to new file Move it from anv_device.c to new file anv_pass.c. Because it will soon grow bigger.	2015-10-29 11:10:03 -07:00
Chad Versace	c284c39b13	anv: Fix parsing of load ops in VkAttachmentDescription My original understanding of VkAttachmentDescription::loadOp, stencilLoadOp was incorrect. Below are all possible combinations: VkFormat \| loadOp=clear stencilLoadOp=clear ---------------+--------------------------- color \| clear-color ignored depth-only \| clear-depth ignored stencil-only \| ignored clear-stencil depth-stencil \| clear-depth clear-stencil	2015-10-29 10:59:55 -07:00
Jason Ekstrand	8bcba083db	anv: Update the README Adds a note that we support SPIR-V revision 32. Also, we now support geometry shaders.	2015-10-28 12:30:34 -07:00
Jason Ekstrand	12feda0c09	Revert "nir/intrinsic: Allow up to four indices" This reverts commit `5eccd0b4b9`. This was only needed for the store_ssbo_vk_indirect intrinsic	2015-10-27 13:44:14 -07:00
Jason Ekstrand	423e7a55cc	Revert "nir/intrinsics: Add new Vulkan load/store intrinsics" This reverts commit `24bcc89c8f`. Now that we have the new vulkan_resource_index intrinsic, these variants of the classic UBO/SSBO intrinsics aren't needed.	2015-10-27 13:43:25 -07:00
Jason Ekstrand	a6be53223e	anv/nir: Work with the new vulkan_resource_index intrinsic	2015-10-27 13:42:51 -07:00
Jason Ekstrand	3d44b3aaa6	nir/spirv: Use the new vulkan_resource_index intrinsic This is instead of using the _vk versions of UBO/SSBO load/store intrinsics	2015-10-27 13:41:59 -07:00
Jason Ekstrand	800a9706f0	nir: Add a vulkan_resource_index intrinsic	2015-10-27 13:41:08 -07:00
Jason Ekstrand	37b6afb3d9	Add a todo comment about input_slots_valid in the FS shader key	2015-10-26 16:25:02 -07:00
Jason Ekstrand	ab6ed2e1ac	anv/gen8_pipeline: Emit a real 3DSTATE_SBE_SWIZ packet	2015-10-26 16:25:02 -07:00
Jason Ekstrand	9006e555ce	anv/pipeline: Bump the size of the pipeline batch to accommodate GS The 1k batch size wasn't big enough for a full pipeline setup including geometry shaders. Some day we should make it dynamic.	2015-10-23 16:50:31 -07:00
Jason Ekstrand	4c59ee808f	anv/gen8_pipeline: Various 3DSTATE_GS fixes	2015-10-23 16:49:26 -07:00
Jason Ekstrand	8aba8cf513	anv/pipeline: Use separate-shader	2015-10-23 10:53:00 -07:00
Jason Ekstrand	760c4b894d	anv/pipeline: Pull separate_shader from NIR for vue map setup	2015-10-23 10:48:52 -07:00
Jason Ekstrand	ee8c67abe8	nir/spirv: Add support for builtins in arrays	2015-10-22 17:58:20 -07:00
Jason Ekstrand	9fe907ec79	nir/spirv: Make the builtins array distinguish between in and out	2015-10-22 17:54:24 -07:00
Jason Ekstrand	d11ea76168	nir/spirv: Make vtn_get_builtin_location smarter Instead of just stomping on the mode, it now asserts that the previously set mode is correct and only changes it if needed. We need to do this because, in geometry shaders, there are some builtins that can be either an input or an output depending on context. We can get that information from the SPIR-V source but we can't throw it away.	2015-10-22 17:45:41 -07:00
Jason Ekstrand	9abef3e817	nir/spirv: Make get_builtin_variable take a nir_variable_mode We'll want this in a moment for validation but, for now, it just gets stomped by get_builtin_variable.	2015-10-22 17:28:25 -07:00
Jason Ekstrand	2ce6636c75	nir/spirv: Remove the vtn_type argument from _vtn_variable_load/store Now that builtins are handled in deref chains, we don't really need this anymore.	2015-10-22 16:56:42 -07:00
Jason Ekstrand	f23d951083	nir/validate: Add better validation of load/store types	2015-10-22 16:53:01 -07:00
Jason Ekstrand	82c579e314	anv/gen8: Set the correct maximum number of GS threads This equation was pulled from mesa gen8_gs_state.c	2015-10-21 21:51:18 -07:00
Jason Ekstrand	d0e8c78407	anv/pipeline: set the gs_vertex_count in compile_gs This was missed in the initial enabling commit.	2015-10-21 21:50:47 -07:00
Jason Ekstrand	8af2a09956	anv/pipeline: Make the has_push_constants computation more accurate The computation used to only look for uniforms that weren't samplers. Now it also filters out arrays of samplers.	2015-10-21 21:50:16 -07:00
Jason Ekstrand	0329a252bd	nir/spirv: Add defaults for GS input/output primitive types These are supposed to be specified in the SPIR-V source as SpvExecutionMode enums but glslang isn't giving them to us. A bug has been filed: https://github.com/KhronosGroup/glslang/issues/84	2015-10-21 21:46:22 -07:00
Jason Ekstrand	4032549885	i965/vec4: Handle returns at the end of functions	2015-10-21 20:42:23 -07:00
Jason Ekstrand	5f29dacda2	i965: Move get_hw_prim_for_gl_prim to brw_util.c	2015-10-21 20:40:28 -07:00
Jason Ekstrand	ea23cb3543	nir/spirv: Add capabilities and decorations for basic geometry shaders	2015-10-21 20:36:25 -07:00
Jason Ekstrand	d538fe849d	anv/pipeline: Add back basic geometry shader support Now that we've done the refactoring upstream, it's much easier to get hooked up. We haven't tested things well enough to know that we're setting up the GPU state correctly for them yet but at least we can compile them now.	2015-10-21 18:45:48 -07:00
Jason Ekstrand	164abff0c0	nir/spirv: Add support for more CS system values	2015-10-21 18:39:06 -07:00
Jason Ekstrand	5790ee2bbb	nir/spirv: Add support for various barrier type instructions	2015-10-21 18:17:11 -07:00
Jason Ekstrand	3d35e4361f	Fix a couple of dereferences	2015-10-21 18:16:50 -07:00
Jason Ekstrand	55a7ee730c	spirv/nir: Add more stage asserts	2015-10-21 18:00:05 -07:00
Jason Ekstrand	27393c8630	nir/spirv: Add support for GS metadata	2015-10-21 17:58:34 -07:00
Jason Ekstrand	a8ffd6e72c	nir/gather_info: Add more info for geometry shaders	2015-10-21 17:42:47 -07:00
Jason Ekstrand	fed60e3c73	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-21 17:40:13 -07:00
Chad Versace	0ab926dfbf	anv: Don't teardown uninitialized anv_physical_device If the user called vkDestroyDevice but never called vkEnumeratePhysicalDevices, then the driver tried to ralloc_free() an uninitialized anv_physical_device. Fixes test 'dEQP-VK.api.device_init.create_instance_name_version'.	2015-10-21 11:55:37 -07:00
Jason Ekstrand	c8572d0f9c	anv/pipeline: Remove a redundant line We set compute_sample_id based on multisample state two lines below.	2015-10-20 16:02:03 -07:00
Jason Ekstrand	72d99f8a40	anv/pipeline: Update a comment	2015-10-20 16:00:55 -07:00
Jason Ekstrand	27d868500a	anv/pipeline: Set key->render_to_fbo to false for fragment shaders Vulkan uses the upper-left convention. This is the same as the DX one and what our hardware does. We had it flipped around.	2015-10-20 15:37:16 -07:00
Jason Ekstrand	59bae36ffb	nir/spirv: Fix a typo	2015-10-20 15:35:13 -07:00
Jason Ekstrand	44b22ca441	nir/spirv: Handle SpvExecutionMode	2015-10-20 15:23:56 -07:00
Jason Ekstrand	a71e614d33	anv: Completely rework shader compilation Now that we have a decent interface in upstream mesa, we can get rid of all our hacks. As of this commit, we no longer use any fake GL state objects and all of shader compilation is moved into anv_pipeline.c. This should make way for actually implementing a shader cache one of these days. As a nice side-benefit, this commit also gains us an extra 300 passing CTS tests because we're actually filling out the texture swizzle information for vertex shaders.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	2d9e899e35	nir: Add a pass to gather info from the shader This pass fills out a bunch of the fields in nir_shader_info by inspecting the shader.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	6fb4469588	anv: Move the brw_compiler from anv_compiler to physical_device	2015-10-20 13:02:03 -07:00
Jason Ekstrand	9e3615cc7d	i965: Move brw_compiler_create to brw_compiler.h	2015-10-20 13:02:03 -07:00
Jason Ekstrand	bf6407079b	i965: Split process_nir into two halves; pre- and post-	2015-10-20 13:02:03 -07:00
Jason Ekstrand	611ace6861	anv/compiler: Remove more pre-SNB shader key setup	2015-10-20 13:02:03 -07:00
Jason Ekstrand	b3a344db30	anv/compiler: Get rid of GS support. The geometry shader support is currently completely untested. As I go through and re-factor the compiler, I'd rather not refactor dead code that I don't have a way to know if I broke. Let's just remove it for now. We can put it back in easily enough later and then we'll do it properly.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	5f5224f256	anv/meta: Use the actual render pass for creating blit pipelines	2015-10-20 13:02:02 -07:00
Chad Versace	4d4e559b6a	vk: Use consistent names for anv_cmd_state dirty bits Prefix all anv_cmd_state dirty bit tokens with ANV_CMD_DIRTY. For example: old -> new ANV_DYNAMIC_VIEWPORT_DIRTY -> ANV_CMD_DIRTY_DYNAMIC_VIEWPORT ANV_CMD_BUFFER_PIPELINE_DIRTY -> ANV_CMD_DIRTY_PIPELINE Change type of anv_cmd_state::dirty and ::compute_dirty from uint32_t to the self-documenting type anv_cmd_dirty_mask_t.	2015-10-20 11:40:24 -07:00
Chad Versace	2484d1a01f	anv/pipeline: Fix requirement for depthstencil state The Vulkan spec allows VkGraphicsPipelineCreateInfo::pDepthStencilState to be NULL when the pipeline's subpass contains no depthstencil attachment (see spec quote below). anv_pipeline_init_dynamic_state() required it unconditionally. This patch fixes anv_pipeline_init_dynamic_state() to access pDepthStencilState only when there is a depthstencil attachment. From the Vulkan spec (20 Oct 2015, git-aa308cb) pDepthStencilState [...] may only be NULL if renderPass and subpass specify a subpass that has no depth/stencil attachment.	2015-10-20 11:29:16 -07:00
Chad Versace	b51468b519	anv/pipeline: Validate VkGraphicsPipelineCreateInfo The Vulkan spec (20 Oct 2015, git-aa308cb) states that some fields of VkGraphicsPipelineCreateInfo are required under certain conditions. Add a new function, anv_pipeline_validate_create_info() that asserts the requirements hold. The assertions helped me discover bugs in Crucible and anv_meta.c.	2015-10-20 10:55:54 -07:00
Chad Versace	855180b3d9	anv: Define anv_validate macro If a block of code is annotated with anv_validate, then the block runs only in debug builds.	2015-10-20 10:55:54 -07:00
Chad Versace	81f8b82fc8	vk/meta: Add required renderpass to pipeline The Vulkan spec (20 Oct 2015, git-aa308cb) requires that VkGraphicsPipelineCreateInfo::renderPass be a valid handle. To satisfy that, define a static dummy render pass used for all meta operations.	2015-10-20 10:48:26 -07:00
Chad Versace	0d84a0d58b	vk/meta: Add required multisample state to pipeline The Vulkan spec (20 Oct 2015, git-aa308cb) requires that VkGraphicsPipelineCreateInfo::pMultisampleState not be NULL.	2015-10-20 10:48:09 -07:00
Jason Ekstrand	60e8439237	anv/compiler: Remove irrelevant wm key setup Most of this applies to Iron Lake and prior only. While we're at it, we get rid of the legacy GL shading model code.	2015-10-19 17:00:26 -07:00
Jason Ekstrand	27ca9ca4e1	anv/compiler: Get rid of legacy shader key setup Most of the shader key setup we did was for pre-Sandybridge and the stuff for SNB+ wasn't in the key setup. That stuff still isn't there but at least we've left ourselves notes for now.	2015-10-19 16:45:11 -07:00
Jason Ekstrand	661d0db077	anv/compiler: Delete legacy clipping code This is a Vulkan driver. We don't need legacy clipping stuff and, even if we did, we don't plan on supporting pre-Sandybridge anyway.	2015-10-19 16:26:16 -07:00
Jason Ekstrand	fba55b711e	anv/compiler: Remove unneeded wm prog data setup As of upstream mesa changes, brw_compile_fs does this for us so there's no need to have the code in the Vulkan driver anymore.	2015-10-19 16:17:41 -07:00
Jason Ekstrand	12c30c9498	nir/spirv: Use the new nir_variable helpers	2015-10-19 16:08:23 -07:00
Jason Ekstrand	7e6959402d	nir/spirv: Handle builtins in OpAccessChain Previously, we were trying to handle them later when loading. However, at that point, you've already lost information and it's harder to handle certain corner-cases. In particular, if you have a shader that does gl_PerVertex.gl_Position.x = foo we have trouble because we see the .x and we don't know that we're in gl_Position. If we, instead, handle it in OpAccessChain, we have all the information we need and we can silently re-direct it to the appropriate variable. This also lets us delete some code which is a nice side-effect.	2015-10-19 15:50:45 -07:00
Jason Ekstrand	958fc04dc5	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-19 14:14:21 -07:00
Jason Ekstrand	995d9c4ac7	anv/pipeline: Remove the ViewportState finishme We should be doing everything we need to with the viewport state	2015-10-17 10:35:29 -07:00
Jason Ekstrand	3e47e34036	anv: Add support for immutable descriptors	2015-10-17 08:17:00 -07:00
Jason Ekstrand	7010fe61c8	anv: Add facilities for dumping an image to a file The ability to dump an arbitrary miplevel or array slice of an anv_image to a file is very useful for debugging. Nothing inside of the driver calls this right now, but it's very useful to call from GDB.	2015-10-16 20:03:06 -07:00
Jason Ekstrand	368e703a01	anv/pipeline: Rework dynamic state handling Apparently, we had the dynamic state array in the pipeline backwards. Instead of enabling the bits in the pipeline, it disables them and marks them as "dynamic".	2015-10-16 16:30:02 -07:00
Jason Ekstrand	8ed23654c9	nir/spirv: Fix handling of vector component selects via OpAccessChain When we get to the end of the _vtn_load/store_variable recursion, we may have one link left in the deref chain if there is a vector component select on the end. In this case, we need to truncate the deref chain early so that, when we make the copy for the load, we don't get the extra deref. The final deref will be handled by the vector extract/insert that comes later.	2015-10-15 21:18:57 -07:00
Jason Ekstrand	2552df41a1	anv/cmd_buffer: Reset the command buffer in BeginCommandBuffer	2015-10-15 18:28:00 -07:00
Jason Ekstrand	298d031642	anv/batch_chain: Add some sanity-check asserts for relocations	2015-10-15 17:24:32 -07:00
Jason Ekstrand	3130851add	anv/x11: Only advertise VK_FORMAT_B8R8G8A8_UNORM The others don't work at the moment so we shouldn't be advertising them.	2015-10-15 16:16:17 -07:00
Jason Ekstrand	f5eec407ea	anv/x11: Treat the pPlatformWindow as a xcb_window_t* instead of xcb_window_t	2015-10-15 15:38:20 -07:00
Jason Ekstrand	03952b1513	anv/device: Add support for combined image and sampler descriptors	2015-10-15 15:17:27 -07:00
Jason Ekstrand	b459b3d82c	anv/device: Remove some unneeded anv_finishmes	2015-10-15 15:17:07 -07:00
Jason Ekstrand	ba20569626	anv/device: Make the CreateSemaphore stub return success	2015-10-15 14:34:07 -07:00
Jason Ekstrand	bed7d1e03c	anv: Add support for BufferInfo in descriptor sets	2015-10-15 13:45:53 -07:00
Jason Ekstrand	6dc4cad994	anv/cmd_buffer: Add an alloc_surface_state helper	2015-10-15 13:45:07 -07:00
Jason Ekstrand	896c1c65d6	anv: Get rid of the descriptor_set_binding struct We no longer need it as we have a better way to deal with dynamic offsets.	2015-10-14 19:02:29 -07:00
Jason Ekstrand	42683e3757	anv: Get rid of backend compiler hacks for descriptor sets Now that we have anv_nir_apply_pipeline_layout, we can hand the backend compiler intrinsics and texture instructions that use a flat buffer index just like it wants. There's no longer any reason for any of these hacks.	2015-10-14 18:38:33 -07:00
Jason Ekstrand	da994f4b7e	anv/nir: Rewrite apply_dynamic_offsets to handle the new vk intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	9c9b7d79c8	anv/nir: Add a pass for applying a pipeline layout to a shader This new pass lowers the _vk intrinsics which take a (set, binding, index) triple to the single-index non-vk intrinsics based on the pipeline layout.	2015-10-14 18:38:33 -07:00
Jason Ekstrand	de608153fb	nir/spirv: Use the Vulkan ubo intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	24bcc89c8f	nir/intrinsics: Add new Vulkan load/store intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	5eccd0b4b9	nir/intrinsic: Allow up to four indices	2015-10-14 18:38:33 -07:00
Jason Ekstrand	b37c38c1ca	anv: Completely rework descriptor set layouts This patch reworks a bunch of stuff in the way we do descriptor set layouts. Our previous approach had a couple of problems. First, it was based on a misunderstanding of arrays in descriptor sets. Second, it didn't properly handle descriptor sets where some bindings were missing stages. The new approach should be correct and also makes some operations, particularly those on the hot-path, a bit easier. We use the descriptor set layout for four things: 1) To determine the map from bindings to the actual flattened descriptor set in vkUpdateDescriptorSets(). 2) To determine the descriptor <-> binding table entry mapping to use in anv_cmd_buffer_flush_descriptor_sets(). 3) To determine the mappings of dynamic indices. 4) To determine the (set, binding, array index) -> binding table entry mapping inside of shaders. The new approach is directly tailored towards these operations.	2015-10-14 18:38:33 -07:00
Chad Versace	7965fe7da6	vk: Add README Requested by developers outside Intel. During the driver's pre-release development, let's make the README easy to find for external experimenters. Keep it at the top of the source tree.	2015-10-14 13:58:29 -07:00
Jason Ekstrand	d2d8945eb8	nir/spirv: Fix a bug in indirect OpAccessChain handling	2015-10-13 20:00:18 -07:00
Jason Ekstrand	db5a5fcd18	anv/image: Add a basic implementation of GetImageSubresourceLayout	2015-10-13 20:00:17 -07:00
Jason Ekstrand	28ed02588a	anv/formats: Use the surface_format_info struct from brw_surface_formats.h The surface_format_info struct changed in mesa but the copied-and-pasted version didn't get updated on the last mesa master merge. This both fixes the bug and should prevent this in the future.	2015-10-13 15:23:24 -07:00
Jason Ekstrand	accbf178eb	i965/surface_formats: Pull the surface_format_info struct into a header	2015-10-13 15:23:24 -07:00
Jason Ekstrand	fd2ec1c8ad	anv/x11: Do something sensible if get_geometry fails in GetSurfaceProperties	2015-10-13 15:10:40 -07:00
Jason Ekstrand	c31f926726	anv/wsi: Add the GetSurfacePresentModesKHR stub Support has existed in the X11 and Wayland backends for a while but, somehow, the entrypoint got missed in the API shuffle.	2015-10-13 11:47:03 -07:00
Jason Ekstrand	e21ecb841c	anv: Declare/validate the correct API version	2015-10-12 18:25:19 -07:00
Jason Ekstrand	0689a0f0f3	anv/device: Return VK_SUCCESS after setting pCount in QueueFamilyProperties	2015-10-10 15:25:08 -07:00
Kristian Høgsberg Kristensen	fc2a66cfcd	Merge ../mesa into vulkan	2015-10-08 17:20:24 -07:00
Jason Ekstrand	48a87f4ba0	anv/queue: Get rid of the serial This was a remnant of the object tagging implementation we had at one point. We haven't used it for a long time so there's no good reason to keep it around.	2015-10-08 12:16:00 -07:00
Jason Ekstrand	8984559892	vk/0.170.2: Update to the new VK_EXT_KHR_swapchain extensions	2015-10-08 12:11:18 -07:00
Chad Versace	2228ec0112	Merge branch 'vulkan-0.170.2' into vulkan This updates the API from 0.138.2 to 0.170.2, and updates SPIR-V to v32.	2015-10-07 11:49:07 -07:00
Chad Versace	7fa98ab182	vk: Remove temporary vulkan headers Remove vulkan-0.138.2.h and vulkan-0.170.2.h. Their purpose was to aid the header update to 0.170.2.	2015-10-07 11:45:48 -07:00
Chad Versace	2f1ca71360	vk/0.170.2: Bump header version The header is now fully updated.	2015-10-07 11:44:44 -07:00
Chad Versace	c2f94e3a0d	vk/0.170.2: Update C++ errata and typedefs	2015-10-07 11:44:33 -07:00
Chad Versace	0ca3c8480d	vk/0.170.2: Update remaining enums	2015-10-07 11:39:49 -07:00
Chad Versace	f9c948ed00	vk/0.170.2: Update VkResult Version 0.170.2 removes most of the error enums. In many cases, I had to replace an error with a less accurate (or even incorrect) one. In other cases, the error path is replaced with an assertion.	2015-10-07 11:36:51 -07:00
Chad Versace	8dee32e71f	vk/0.170: Update VkDescriptorInfo Ignore the new bufferInfo field with a anv_finishme.	2015-10-07 10:58:55 -07:00
Chad Versace	92e7bd3610	vk/0.170.2: Update vkCreateDescriptorPool Nothing to do. In Mesa the pool is a stub.	2015-10-07 10:47:55 -07:00
Chad Versace	a3bc07c23b	vk/0.170.2: Update VkAttachmentDescription	2015-10-07 10:44:40 -07:00
Chad Versace	82259f88dd	vk/0.170.2: Update VkImageViewCreateInfo	2015-10-07 10:43:44 -07:00
Chad Versace	f4295b3cca	vk/0.170.2: Update VkImageCreateInfo	2015-10-07 10:43:17 -07:00
Chad Versace	d48e71ce55	vk/0.170.2: Update VkPhysicalDeviceProperties	2015-10-07 10:36:46 -07:00
Chad Versace	81e1dcc42c	vk/0.170.2: Update VkImageFormatProperties	2015-10-07 10:28:30 -07:00
Chad Versace	98c2bb6917	vk/0.170.2: Update VkFormatProperties	2015-10-07 10:15:59 -07:00
Chad Versace	545f5cc6e1	vk/0.170.2: Update VkPhysicalDeviceFeatures	2015-10-07 10:09:39 -07:00
Chad Versace	033a37f591	vk/0.170.2: Update VkPhysicalDeviceLimits	2015-10-07 10:09:31 -07:00
Jason Ekstrand	982466aeff	anv/device: Remove some #ifdef'd out code This was a left-over from the dynamic state update.	2015-10-07 09:45:49 -07:00
Jason Ekstrand	010c6efd65	vk/0.170.2: Make vkUpdateDescriptorSets return void	2015-10-07 09:44:53 -07:00
Jason Ekstrand	1a52bc3039	anv/pipeline: Add support for dynamic state in pipelines	2015-10-07 09:40:49 -07:00
Jason Ekstrand	daf68a9465	vk/0.170.2: Switch to the new dynamic state model	2015-10-07 09:40:49 -07:00
Jason Ekstrand	55fcca306b	anv: Add a dynamic state data structure and basic helpers	2015-10-07 09:36:27 -07:00
Jason Ekstrand	941a105954	anv/private: Add a typed_memcpy macro This is amazingly helpful when copying arrays of things around.	2015-10-07 09:36:27 -07:00
Chad Versace	b1c024a932	vk/meta: Fix -Wstrict-prototypes In C, functions with no arguments require a void argument. build_nir_clear_fragment_shader() lacked that. Fixes: anv_meta.c:70:1: warning: function declaration isn't a prototype [-Wstrict-prototypes]	2015-10-07 09:10:25 -07:00
Chad Versace	6dea1a9ba1	vk/0.170.2: Merge VkAttachmentView into VkImageView	2015-10-07 09:10:25 -07:00
Chad Versace	03dd72279f	vk/image: Fix retrieval of anv_surface for depthstencil aspect If anv_image_get_surface_for_aspect_mask() is given a combined depthstencil aspect mask, and the image has a stencil surface but no depth surface, then return the stencil surface. Hacks on hacks.	2015-10-07 09:10:25 -07:00
Chad Versace	85ff3cfde3	vk: Drop -Wextra Eliminates lots of warnings due to anv_meta.c's inclusion of nir.h. I like the extra warnings, and they should probably get fixed. However, git-grep reveals that no other Mesa directory uses -Wextra. Building Vulkan produces a lot of compiler warnings from core Mesa headers that no other Mesa developer sees, and hence no other Mesa developer will fix.	2015-10-07 07:28:46 -07:00
Chad Versace	24de3d49ea	vk: Embed two surface states in anv_image_view This prepares for merging VkAttachmentView into VkImageView. The two surface states are: anv_image_view::color_rt_surface_state: RENDER_SURFACE_STATE when using image as a color render target. anv_image_view::nonrt_surface_state: RENDER_SURFACE_STATE when using image as a non render target. No Crucible regressions.	2015-10-06 21:22:18 -07:00
Chad Versace	37bf120930	vk/pipeline: Emit MSAA finishme only if samples > 1 If samples == 1, then there's nothing for Mesa to do, and the finishme message is only noise.	2015-10-06 21:22:18 -07:00
Chad Versace	3fc2b1f325	vk: Remove stale finishme for stencil image views They don't work completely. But they work well enough to satisfy Crucible.	2015-10-06 21:22:18 -07:00
Chad Versace	44143a1f46	vk: Add anv_image::usage It's a copy of VkImageCreateInfo::usage. Will be used for the VkAttachmentView/VkImageView merge.	2015-10-06 21:22:18 -07:00
Chad Versace	cf603714cb	vk/meta: Fix usage flags for image-wrapped-buffers In make_image_for_buffer(), use VK_IMAGE_USAGE_SAMPLED_BIT when transferring from the buffer and use VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT when transferring to the buffer.	2015-10-06 21:22:18 -07:00
Chad Versace	d00718104f	vk/image: Remove stale anv_asserts for depthstencil attachments We don't fully handle mipmapped, array depthstencil attachments. But we handle them well enough for Crucible's miptree tests.	2015-10-06 21:22:18 -07:00
Kristian Høgsberg Kristensen	1d7ef82f4b	i965: Delete brw_cs.cpp which was deleted in master	2015-10-06 15:20:19 -07:00
Jason Ekstrand	c272bb58f5	nir/spirv: Better texture handling	2015-10-06 15:10:45 -07:00
Jason Ekstrand	ccea9cc332	nir/spirv: Update to SPIR-V Rev. 32	2015-10-06 14:52:35 -07:00
Jason Ekstrand	89eebd889c	vk/0.170.2: Fairly trivial enum shuffling	2015-10-06 14:08:08 -07:00
Jason Ekstrand	1e4263b7d2	vk/0.170.2: s/baseArraySlice/baseArrayLayer/	2015-10-06 14:08:08 -07:00
Chad Versace	d4446a7e58	vk: Merge anv_attachment_view into anv_image_view This prepares for merging VkAttachmentView into VkImageView.	2015-10-06 12:13:03 -07:00
Chad Versace	6b5ce5daf5	vk: Update comments for anv_image_view - Document the extent member. It's the extent of the view's base level. - s/VkAttachmentView/VkImageView/	2015-10-06 12:12:52 -07:00
Jason Ekstrand	19018c9f13	vk/0.170.2: Add a stage field to ShaderCreateInfo	2015-10-06 10:20:10 -07:00
Jason Ekstrand	cc389b1482	vk/0.170.2: Rename cs to stage in ComputePipelineCreateInfo	2015-10-06 10:11:50 -07:00
Jason Ekstrand	588d40e97a	vk/0.170.2: Use ImageSubresourceCopy in ImageResolve	2015-10-06 10:09:47 -07:00
Jason Ekstrand	bd4cde708a	vk/0.170.2: Rename fields in VkClearColorValue	2015-10-06 10:07:47 -07:00
Jason Ekstrand	81c7fa8772	vk/0.170.2: Rework blits to use ImageSubresourceCopy	2015-10-06 10:04:04 -07:00
Jason Ekstrand	ba2254aa79	vulkan.h: Move stuff around This has no functional change but substantially decreases the diff with the 0.170.2 header.	2015-10-06 09:50:04 -07:00
Jason Ekstrand	d1908d2c33	vk/0.170.2: Rework parameters to CmdClearDepthStencil functions	2015-10-06 09:40:39 -07:00
Jason Ekstrand	02a9be31d6	vk/0.170.2: Add the flags parameter to GetPhysicalDeviceImageFormatProperties	2015-10-06 09:37:21 -07:00
Jason Ekstrand	a145acd812	vk/0.170.2: Remove the pCount parameter from AllocDescriptorSets	2015-10-06 09:32:01 -07:00
Jason Ekstrand	8ba684cbad	vk/0.170.2: Rename extension and layer query functions	2015-10-06 09:25:03 -07:00
Jason Ekstrand	a6eba403e2	vk/0.170.2: Update to the new queue family properties query	2015-10-05 21:17:12 -07:00
Jason Ekstrand	65964cd49b	vk/0.170.2: Re-arrange parameters of vkCmdDraw[Indexed]	2015-10-05 21:10:20 -07:00
Jason Ekstrand	05a26a60c8	vk/0.170.2: Make destructors return void	2015-10-05 20:50:51 -07:00
Jason Ekstrand	460676122f	vk/0.170.2: Rename VkClearValue.ds to depthStencil	2015-10-05 20:35:08 -07:00
Jason Ekstrand	8e1ef639b6	vk/0.170.2: Add the subpass field to VkCmdBufferBeginInfo	2015-10-05 20:30:53 -07:00
Jason Ekstrand	757166592e	vk/0.170.2: Rename pointer parameters of VkSubpassDescription	2015-10-05 20:26:21 -07:00
Jason Ekstrand	57f500324b	vk/0.170.2: Add unnormalizedCoordinates to VkSamplerCreateInfo	2015-10-05 20:17:24 -07:00
Jason Ekstrand	f7c3519aaf	vk/0.170.2: Rename VkTexAddress to VkTexAddressMode	2015-10-05 20:15:06 -07:00
Jason Ekstrand	39a19e88a3	vulkan.h: Various cosmetic changes These don't affect the driver in any way.	2015-10-05 20:06:30 -07:00
Chad Versace	9357062348	vk: Merge anv_*_attachment_view into anv_attachment_view Remove anv_color_attachment_view and anv_depth_stencil_view, merging them into anv_attachment_view. This prepares for merging VkAttachmentView into VkImageView.	2015-10-05 17:46:04 -07:00
Chad Versace	ae30535602	vk: Drop anv_attachment_view::extent It's duplicated by anv_attachment_view::image_view::extent.	2015-10-05 17:46:04 -07:00
Chad Versace	f0f4dfa9cc	vk: Drop anv_surface_view Push the members of struct anv_surface_view into anv_image_view and anv_buffer_view, then remove struct anv_surface_view. Observe that anv_surface_view::range is not needed for anv_image_view, and so was dropped there. This prepares for the merge of VkAttachmentView into VkImageView. Removing the common parent of anv_buffer_view and anv_image_view (that is, anv_surface_view) will make the merge easier.	2015-10-05 17:46:04 -07:00
Chad Versace	74193a880f	vk: Use consistent names for anv_*_view variables Rename all anv_*_view variables to follow this convention: - sview -> anv_surface_view - bview -> anv_buffer_view - iview -> anv_image_view - aview -> anv_attachment_view - cview -> anv_color_attachment_view - ds_view -> anv_depth_stencil_attachment_view This clarifies existing code. And it will reduce noise in the upcoming commits that merge VkAttachmentView into VkImageView.	2015-10-05 17:46:04 -07:00
Chad Versace	ffd051830d	vk: Unionize anv_descriptor For a given struct anv_descriptor, all members are NULL (in which case the descriptor is empty) or exactly one member is non-NULL. To make struct anv_descriptor better reflect its set of valid states, convert the struct into a tagged union.	2015-10-05 17:46:04 -07:00
Chad Versace	63439953d7	vk: Drop dependency on no longer extant header anv_meta no longer uses GLSL shaders, and the build system no longer converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from Makefile.am. (cherry picked from commit `2fc8122f66`)	2015-10-05 17:06:19 -07:00
Chad Versace	2fc8122f66	vk: Drop dependency on no longer extant header anv_meta no longer uses GLSL shaders, and the build system no longer converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from Makefile.am.	2015-10-05 17:04:18 -07:00
Chad Versace	8bf021cf3d	vk: Return anv_image_view_info by value The struct is only 2 bytes. Returning it on the stack is better than returning a reference into the ELF .data segment.	2015-10-05 13:22:44 -07:00
Chad Versace	4ffb4549e0	vk/image: Document a Vulkan spec requirement for depthstencil The Vulkan spec (git a511ba2) requires support for some combined depth stencil formats.	2015-10-05 13:18:44 -07:00
Chad Versace	3530224063	vk: Annotate anv_cmd_state::gen7::index_type It's the value of 3DSTATE_INDEX_BUFFER.IndexFormat.	2015-10-05 08:58:35 -07:00
Chad Versace	9c93aa9141	vk: Better types for VkShaderStage, VkShaderStageFlags vars In most places, the variable type was the uninformative uint32_t.	2015-10-05 08:55:09 -07:00
Chad Versace	6317c3144d	vk/0.170.2: Drop VK_BUFFER_USAGE_GENERAL	2015-10-05 08:12:59 -07:00
Chad Versace	4744f60e79	vk/0.170.2: Drop enum VkBufferViewType	2015-10-05 08:12:58 -07:00
Chad Versace	7a089bd1a6	vk/0.170.2: Update VkImageSubresourceRange Replace 'aspect' with 'aspectMask'.	2015-10-05 08:10:57 -07:00
Chad Versace	568654d606	vk/0.170.2: Drop VK_IMAGE_USAGE_GENERAL	2015-10-05 08:09:33 -07:00
Chad Versace	6a40af1b08	vk/0.170.2: Update VkPipelineMultisampleStateCreateInfo	2015-10-04 10:00:25 -07:00
Chad Versace	dd04be491d	vk/0.170.2: Update VkPipelineDepthStencilStateCreateInfo Rename member depthBoundsEnable -> depthBoundsTestEnable.	2015-10-04 09:41:46 -07:00
Chad Versace	8cb2e27c62	vk/0.170.2: Update VkRenderPassBeginInfo Rename members: attachmentCount -> clearValueCount pAttachmentClearValues -> pClearValues	2015-10-04 09:26:25 -07:00
Chad Versace	3694518be5	vk/0.170.2: Drop VkBufferViewCreateInfo::viewType	2015-10-04 09:14:57 -07:00
Chad Versace	216d9f248d	vk: Copy current header to vulkan-0.138.2.h While upgrading Mesa to the new 0.170.2 API, it's convenient to have all three headers available in the tree: - vulkan-0.138.2.h, the old one - vulkan-0.170.2.h, the new one - vulkan.h, the one in transition	2015-10-04 09:09:35 -07:00
Chad Versace	7f18ed4b9f	vk: Import 0.170.2 header from LunarG SDK From the LunarG SDK at tag sdk-0.9.1, import vulkan.h as vulkan-0.170.2.h. This header is the first provisional header with the addition of minor fixes.	2015-10-04 09:09:31 -07:00
Jason Ekstrand	09ba0a7c05	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-03 11:32:29 -07:00
Jason Ekstrand	ef56cf7738	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-02 16:52:47 -07:00
Jason Ekstrand	10f97718c3	anv/allocator: Add a sanity assertion in state stream finish. We assert that the block offset we got while walking the list of blocks is actually a multiple of the block size. If something goes wrong and the GPU decides to stomp on the surface state buffer we can end up getting corruptions in our list of blocks. This assertion makes such corruptions a crash with a meaningful message rather than an infinite loop.	2015-10-02 16:24:42 -07:00
Jason Ekstrand	002e7b0cc3	anv: Remove the GLSL -> SPIR-V scraper/converter This was very useful to get us up-and-going. However, now that we can use NIR directly for meta shaders, we don't need this anymore and we might as well drop the glslc dependency.	2015-10-02 16:20:04 -07:00
Jason Ekstrand	f5ffb0e0cb	anv/meta: Use NIR directly for blit shaders	2015-10-02 16:18:44 -07:00
Jason Ekstrand	7851a4392a	anv/meta: Use NIR directly for clear shaders	2015-10-02 16:18:32 -07:00
Jason Ekstrand	add99c4beb	anv: Add a back-door for passing NIR shaders directly into the pipeline This will allow us to use NIR directly for meta operations rather than having to go through SPIR-V.	2015-10-02 16:16:57 -07:00
Jason Ekstrand	b68805f83c	anv: Add some NIR builder helpers These should all eventually be up-streamed. However, since they currently have no upstream users, they would just bitrot there. We'll keep them local for the time being.	2015-10-02 16:15:53 -07:00
Jason Ekstrand	c1553653a2	vk/wsi/x11: Send OUT_OF_DATE if the X drawable goes away	2015-10-02 13:44:53 -07:00
Kristian Høgsberg Kristensen	005c8e0106	Merge branch 'master' of ../mesa into vulkan	2015-10-01 14:24:29 -07:00
Jason Ekstrand	337caee910	anv/wsi_x11: Properly report BadDrawable errors to the client	2015-09-28 20:18:41 -07:00
Jason Ekstrand	f06bc45b0c	anv/batch_chain: Use the surface state pool for binding tables	2015-09-28 16:01:14 -07:00
Jason Ekstrand	d93f6385a7	anv/batch_chain: Add helpers for fixing up block_pool relocations	2015-09-28 16:01:14 -07:00
Jason Ekstrand	8c00f9ab56	anv/gen8: Do a render cache flush prior to changing state base address	2015-09-28 16:01:14 -07:00
Jason Ekstrand	0e94446b25	anv/device: Use a 4K block size for surface state blocks We want to start using the surface state block pool for binding tables. In order to do this, we need to be able to set surface state base address to the address of a block and surface state base address has a 4K alignment requirement.	2015-09-28 16:01:01 -07:00
Jason Ekstrand	737e89bc8d	anv/meta: Use the dynamic state stream for temporary buffers	2015-09-28 16:01:01 -07:00
Jason Ekstrand	219a1929f7	anv/util: Add helpers for getting the first and last elements of a vector	2015-09-28 16:01:01 -07:00
Jason Ekstrand	95487668df	anv/batch_chain: Add a _alloc_binding_table function	2015-09-28 16:01:01 -07:00
Jason Ekstrand	d517de6126	anv: Make anv_state.offset an int32_t Binding tables will have a negative offset and we need a way to express that. Besides, the chances of a state offset being larger than 2 GB is so remote it's not worth thinking about.	2015-09-28 16:01:01 -07:00
Jason Ekstrand	9ac3dde3a0	anv/wsi_wayland: Fix FIFO mode Previously, there were a number of things we were doing wrong: 1) We weren't flushing the wl_display so dead-looping clients weren't guaranteed to work. 2) We were sending the frame event after calling wl_surface.commit() so it wasn't getting assigned to the correct frame. 3) We weren't actually setting fifo_ready to false. Unfortunately, we never noticed because (3) was hiding the other two. This commit fixes all three and clients that use FIFO mode are now properly refresh-rate limited.	2015-09-28 15:58:34 -07:00
Chad Versace	ddcedb979a	vk: Implement vkGetPhysicalDeviceImageFormatProperties() The implementation is incomplete because we lie about VkImageFormatProperties::maxResourceSize, hardcoding it to UINT32_MAX for all supported cases.	2015-09-28 11:53:39 -07:00
Chad Versace	9f3122db0e	vk: Refactor anv_GetPhysicalDeviceFormatProperties() Move the bulk of the function body to a new function anv_physical_device_get_format_properties(). This allows us to reuse the function when implementing anv_GetPhysicalDeviceImageFormatProperties() without calling into the public entry point.	2015-09-28 11:53:39 -07:00
Chad Versace	c15ce5c834	vk: Advertise that depthstencil formats support sampling Let vkGetPhysicalDeviceFormatProperties() set VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT for tiled depthstencil images.	2015-09-28 11:53:39 -07:00
Jason Ekstrand	4e48f94469	anv/device: Wrap a couple valgrind calls in the VG macro This fixes the build for systems that don't have valgrind devel packages installed.	2015-09-28 11:18:52 -07:00
Chad Versace	97636345da	vk: Fix vkGetPhysicalDeviceSparseImageFormatProperties() The driver does not yet support sparse images, so return zero properties for all formats.	2015-09-28 10:17:48 -07:00
Kristian Høgsberg Kristensen	164f08c255	vk: Add anv_icd.json to .gitignore	2015-09-25 15:16:56 -07:00
Kristian Høgsberg Kristensen	850cfcad3e	vk: Also define vk_errorf in non-debug builds	2015-09-25 15:15:37 -07:00
Kristian Høgsberg Kristensen	cf24211d55	vk: Roll back GLSL parser support for vulkan In the interest of reducing our delta to mesa master, let's undo these changes now that we only support SPIR-V.	2015-09-25 10:42:07 -07:00
Jason Ekstrand	e9dff5bb99	vk: Add an ICD declaration file	2015-09-24 14:45:58 -07:00
Jason Ekstrand	39cd3783a4	anv: Add support for the ICD loader	2015-09-24 14:45:58 -07:00
Jason Ekstrand	a95f51c1d7	anv: Add a global dispatch table for use in meta operations	2015-09-24 14:45:58 -07:00
Jason Ekstrand	00d18a661f	anv/entrypoints: Expose the anv_resolve_entrypoint function	2015-09-24 14:45:58 -07:00
Jason Ekstrand	f5e72695e0	anv/entrypoints: Rename anv_layer to anv_dispatch_table	2015-09-24 14:45:58 -07:00
Jason Ekstrand	913a9b76f7	anv/batch_chain: Remove the current_surface_bo helper It's no longer used outside anv_batch_chain so we certainly don't need to be exporting it. Inside anv_batch_chain, it's only used twice and it can be replaced by a single line so there's really no point.	2015-09-24 08:46:41 -07:00
Jason Ekstrand	bc17f9c9d7	anv/cmd_buffer: Add a helper for getting the surface state base address	2015-09-24 08:42:38 -07:00
Jason Ekstrand	e1a7c721d3	anv/allocator: Don't ever call mremap This has always been a bit sketchy and neither Kristian nor I have ever really liked it.	2015-09-24 08:42:14 -07:00
Jason Ekstrand	99e62f5ce8	anv/allocator: Delete the unused center_fd_offset from anv_block_pool	2015-09-24 08:41:56 -07:00
Jason Ekstrand	429665823d	anv/allocator: Do a better job of centering bi-directional block pools	2015-09-24 08:41:47 -07:00
Jason Ekstrand	76be58efce	anv/batch_chain: Clean up the reloc list swapping code	2015-09-24 08:41:38 -07:00
Jason Ekstrand	041f5ea089	anv/meta: Add location specifiers to meta shaders	2015-09-21 16:21:56 -07:00
Jason Ekstrand	f406b708a5	Merge branch 'nir-spirv' into vulkan	2015-09-17 20:03:40 -07:00
Jason Ekstrand	616db92b01	nir/spirv: Add better location handling Previously, our location handling was focussed on either no location (usually implicit 0) or a builtin. Unfortunately, if you gave it a location, it would blow it away and just not care. This worked fine with crucible and our meta shaders but didn't work with the CTS. The new code uses the "data.explicit_location" field to denote that it has a "final" location (usually from a builtin) and, otherwise, the location is considered to be relative to the base for that shader stage.	2015-09-17 20:02:46 -07:00
Jason Ekstrand	a788e7c659	anv/device: Move mutex initialization to before block pools	2015-09-17 18:23:21 -07:00
Jason Ekstrand	595e6cacf1	meta: Initial support for packing parameters Probably incomplete but it should do for now	2015-09-17 18:21:05 -07:00
Jason Ekstrand	d616493953	anv/meta: Pass the depth through the clear vertex shader It shouldn't matter since we shut off the VS but it's at least clearer.	2015-09-17 18:09:21 -07:00
Jason Ekstrand	3b8aa26b8e	anv/formats: Properly report depth-stencil formats	2015-09-17 17:44:20 -07:00
Jason Ekstrand	b5f6889648	vk/device: Don't allow device or instance creation with invalid extensions	2015-09-17 17:44:20 -07:00
Jason Ekstrand	dcf424c98c	anv/tests: Add some asserts for data integrity in block_pool_no_free	2015-09-17 17:44:20 -07:00
Jason Ekstrand	5f57ff7e18	anv/allocator: Make the block pool double-ended This allows us to allocate from either side of the block pool in a consistent way. If you use the previous block_pool_alloc function, you will get offsets from the start of the pool as normal. If you use the new block_pool_alloc_back function, you will get a negative index that corresponds to something in the "back" of the pool.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	15624fcf55	anv/tests: Refactor the block_pool_no_free test This simply breaks the monotonicity check out into its own function	2015-09-17 17:44:20 -07:00
Jason Ekstrand	55daed947d	vk/allocator: Split block_pool_alloc into two functions	2015-09-17 17:44:20 -07:00
Jason Ekstrand	c55fa89251	anv/allocator: Use a signed 32-bit offset for the free list This has the unfortunate side-effect of making it so that we can't have a block pool bigger than 1GB. However, that's unlikely to happen and, for the sake of bi-directional block pools, we need negative offsets.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	8c6bc1e85d	anv/allocator: Create 2GB memfd up-front for the block pool	2015-09-17 17:44:20 -07:00
Jason Ekstrand	74bf7aa07c	anv/allocator: Take the device mutex when growing a block pool We don't have any locking issues yet because we use the pool size itself as a mutex in block_pool_alloc to guarantee that only one thread is resizing at a time. However, we are about to add support for growing the block pool at both ends. This introduces two potential races: 1) You could have two block_pool_alloc() calls that both try to grow the block pool, one from each end. 2) The relocation handling code will now have to think about not only the bo that we use for the block pool but also the offset from the start of that bo to the center of the block pool. It's possible that the block pool growing code could race with the relocation handling code and get a bo and offset out of sync. Grabbing the device mutex solves both of these problems. Thanks to (2), we can't really do anything more granular.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	222ddac810	anv: Document the index and offset parameters of anv_bo	2015-09-17 17:44:20 -07:00
Chad Versace	85520aa070	vk/image: Remove stale FINISHME for non-2D image views gen8_image_view_init() now supports 1D, 2D, and 3D image views.	2015-09-14 15:16:57 -07:00
Chad Versace	622a317e4c	vk/image: Teach vkCreateImage about layout of 1D surfaces Calling vkCreateImage() with VK_IMAGE_TYPE_1D now succeeds and computes the surface layout correctly.	2015-09-14 15:15:12 -07:00
Chad Versace	6221593ff8	vk/meta: Partially implement vkCmdCopy, vkCmdBlit for 3D images Partially implement the below functions for 3D images: vkCmdCopyBufferToImage vkCmdCopyImageToBuffer vkCmdCopyImage vkCmdBlitImage Not all features work, and there is much room for performance improvement. Beware that vkCmdCopyImage and vkCmdBlitImage are untested. Crucible proves that vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer work, though. Supported: - copy regions with z offset Unsupported: - copy regions with extent.depth > 1 Crucible test results on master@d452d2b are: pass: func.miptree.r8g8b8a8-unorm..view-3d. pass: func.miptree.d32-sfloat..view-3d. fail: func.miptree.s8-uint..view-3d.	2015-09-14 14:27:34 -07:00
Chad Versace	0ecafe0285	vk/meta: Rename meta_emit_blit() params Rename src -> src_view and dest -> dest_view. This reduces noise in the next patch's diff, which adds new params to the function.	2015-09-14 12:29:51 -07:00
Chad Versace	b659a066e9	vk/gen8: Set RENDER_SURFACE_STATE::RenderTargetViewExtent	2015-09-14 12:29:49 -07:00
Chad Versace	ffa61e1572	vk/gen8: Refactor setting of SURFACE_STATE::Depth The field's meaning depends on SURFACE_STATE::SurfaceType. Make that correlation explicit by switching on VkImageType. For good measure, add some PRM quotes too.	2015-09-14 12:27:05 -07:00
Chad Versace	eed74e3a02	vk: Teach vkCreateImage about layout of 3D surfaces Calling vkCreateImage() with VK_IMAGE_TYPE_3D now succeeds and computes the surface layout correctly. However, 3D images do not yet work for many other Vulkan entrypoints.	2015-09-14 11:04:08 -07:00
Chad Versace	e01d5a0471	vk: Refactor anv_image_make_surface() Move the code that calculates the layout of 2D surfaces into a switch case.	2015-09-14 11:00:18 -07:00
Jason Ekstrand	8c8ad6dddf	vk: Use push constants for dynamic buffers	2015-09-11 15:56:19 -07:00
Jason Ekstrand	2b4a2eb592	vk/compiler: Rework create_params_array	2015-09-11 15:55:54 -07:00
Jason Ekstrand	c3086c54a8	vk/compiler: Add a NIR pass for pushing dynamic buffer offset This commit just adds the NIR pass but does none of the uniform setup	2015-09-11 15:53:56 -07:00
Jason Ekstrand	7487371056	vk/pipeline_layout: Add dynamic_offset_start and has_dynamic_offsets fields	2015-09-11 15:52:43 -07:00
Jason Ekstrand	de5220c7ce	vk/pipeline_layout: Move surface/sampler start from SoA to AoS This makes more sense to me and it's more consistent with anv_descriptor_set_layout.	2015-09-11 10:43:55 -07:00
Jason Ekstrand	b908c67816	vk: Rework the push constants data structure Previously, we simply had a big blob of stuff for "driver constants". Now, we have a very specific data structure that contains the driver constants that we care about.	2015-09-11 10:25:23 -07:00
Jason Ekstrand	fd21f0681a	Add the wayland protocol files to .gitignore	2015-09-11 09:29:40 -07:00
Jason Ekstrand	8040dc4ca5	vk/error: Handle ERROR_OUT_OF_DATE_WSI	2015-09-08 12:13:07 -07:00
Jason Ekstrand	060720f0c9	vk/wsi/x11: Actually block on X so we don't re-use busy buffers	2015-09-08 11:51:47 -07:00
Jason Ekstrand	1bee19e023	vk: Add the WSI header files	2015-09-08 10:33:46 -07:00
Jason Ekstrand	2f3de6260d	Merge branch 'nir-spirv' into vulkan	2015-09-05 14:12:59 -07:00
Jason Ekstrand	4d73ca3c58	nir/spirv.h: Remove some cruft missed while merging There were merge conflicts in spirv.h that got missed because they were in a comment and so it still compiled. This gets rid of them and we should be on-par with upstream spirv->nir.	2015-09-05 14:11:40 -07:00
Jason Ekstrand	612b13aeae	nir/spirv: Add support for most of the rest of texturing Assuming this all works, about the only thing left should be some corner-cases for tg4	2015-09-05 14:10:05 -07:00
Jason Ekstrand	fe786ff67d	Merge branch 'nir-spirv' into vulkan	2015-09-05 13:17:53 -07:00
Jason Ekstrand	35fcd37fcf	nir/spirv: Handle decorations after assigning variable locations	2015-09-05 13:17:21 -07:00
Jason Ekstrand	87d02f515b	Merge branch 'nir-spirv' into vulkan	2015-09-05 09:48:33 -07:00
Jason Ekstrand	9be43ef99c	nir/spirv: Handle the MatrixStride member decoration	2015-09-05 09:47:45 -07:00
Jason Ekstrand	01924a03d4	vk: Actually link in wayland libraries Turns out this was why I had accidentally broken the universe. Oops...	2015-09-04 20:02:38 -07:00
Jason Ekstrand	2c4ae00db6	vk: Conditionally compile Wayland support Pulling in libwayland causes undefined symbols in applications that are linked against vulkan alone. Ideally, we would like to dlopen a platform support library or something like that. For now, this works and should get crucible running again.	2015-09-04 19:18:52 -07:00
Jason Ekstrand	b3c037f329	vk: Fix size return value handling in a couple places	2015-09-04 19:05:51 -07:00
Jason Ekstrand	9a95d08ed6	Merge branch 'nir-spirv' into vulkan	2015-09-04 18:54:15 -07:00
Jason Ekstrand	6d5dafd779	nir/spirv/glsl450: Use the correct write mask	2015-09-04 18:50:14 -07:00
Jason Ekstrand	7174d155e9	nir: Add a lower_fdiv option and use it in i965	2015-09-04 18:50:14 -07:00
Jason Ekstrand	f32d16a9f0	nir/spirv: Use the actual GLSL 450 extension header from Khronos	2015-09-04 18:50:14 -07:00
Jason Ekstrand	9e2c13350e	nir/spirv: Add support for SpvDecorationColMajor	2015-09-04 18:50:14 -07:00
Jason Ekstrand	f3bdb93a8e	nir/types: Allow single-column matrices This can sometimes be a convenient way to build vectors.	2015-09-04 18:50:14 -07:00
Jason Ekstrand	48e87c0163	vk/wsi: Add Wayland WSI support	2015-09-04 17:55:42 -07:00
Jason Ekstrand	348cb29a20	vk/wsi: Move to a callback system for the entire WSI implementation We do this for two reasons: First, because it allows us to simplify WSI and compiling in/out support for a particular platform is as simple as calling or not calling the platform-specific init function. Second, the implementation gives us a place for a given chunk of the WSI to stash stuff in the instance.	2015-09-04 17:55:42 -07:00
Jason Ekstrand	06d8fd5881	vk/instance: Expose anv_instance_alloc/free	2015-09-04 17:55:42 -07:00
Jason Ekstrand	c0b97577e8	vk/WSI: Use a callback mechanism instead of explicit switching	2015-09-04 17:55:42 -07:00
Jason Ekstrand	ca3cfbf6f1	vk: Add an initial implementation of the actual Khronos WSI extension Unfortunately, this is a very large commit and removes the old LunarG WSI extension. This is because there are a couple of entrypoints that have the same name between the two extensions so implementing them both is impractical. Support is still incomplete, but this is enough to get vkcube up and going again.	2015-09-04 17:55:42 -07:00
Jason Ekstrand	3d9fbb6575	vk: Add initial support for VK_WSI_swapchain	2015-09-04 17:55:42 -07:00
Jason Ekstrand	beb466ff5b	vk: Move anv_x11.c to anv_wsi_x11.c	2015-09-04 17:55:42 -07:00
Jason Ekstrand	9a7600c9b5	vk/device: Use an array for device extensions	2015-09-04 17:55:42 -07:00
Kristian Høgsberg Kristensen	8af3624651	vk: Further reduce diff to master Now that we don't compile GLSL, we can roll back a few more hacks and unexport some things from the backend compiler. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-04 16:17:01 -07:00
Kristian Høgsberg Kristensen	7c1d20dc48	vk: Drop GLSL code from anv_compiler.cpp Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 14:02:11 -07:00
Kristian Høgsberg Kristensen	316c8ac53b	vk: Assert that the SPIR-V module has the magic number Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 12:27:28 -07:00
Kristian Høgsberg Kristensen	6e35a1f166	vk: Remove various hacks/scaffolding code Since we switched away from calling brwCreateContext() there's a bit of hacky support we can now delete. This reduces our diff to upstream master. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 12:17:13 -07:00
Kristian Høgsberg Kristensen	1d787781ff	vk: Fall back to previous gens in entry point resolver We used to always just do a one-level fallback from genX_* to anv_* entry points. That worked for gen7 and gen8 where all entry points were either different or could be made anv_* entry points (eg anv_CreateDynamicViewportState). We're about to add gen9 and now need to be able to fall back to gen8 entry points for most things. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	c4dbff58d8	vk: Drop redundant gen7_CreateGraphicsPipelines This is handled by anv_CreateGraphicsPipelines(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	b5e90f3f48	vk: Use vk* entrypoints in meta, not driver_layer pointers We'll change the dispatch mechanism again in a later commit. Stop using the driver_layer function pointers and just use the public entry points. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	82396a5514	vk: Drop check for I915_PARAM_HAS_EXEC_CONSTANTS We don't use this kernel feature. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen	c4b30e7885	vk: Add new vk_errorf that takes a format string This allows us to annotate error cases in debug builds. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen	2e346c882d	vk: Make vk_error a little more helpful Print out file and line number and translate the error code to the symbolic name. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Chad Versace	0cb26523d3	vk/image: Add PRM reference for QPitch equation Suggested-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-03 11:04:38 -07:00
Chad Versace	28503191f1	vk/meta: Partially fix vkCmdCopyBufferToImage for S8_UINT Create R8_UINT VkAttachmentView and VkImageView for the stencil data. This fixes a crash, but the pixels in the destination image are still incorrect. They are not properly tiled. Fixes crashes in Crucible tests func.miptree.s8-uint.aspect-stencil.* as of crucible-7471449. Test results improve 'lost' -> 'fail'.	2015-09-02 11:08:36 -07:00
Jason Ekstrand	be0a4da6a5	vk/meta: Use SPIR-V for shaders We are also now using glslc for compiling the Vulkan driver like we do in crucible.	2015-09-01 15:16:06 -07:00
Jason Ekstrand	362ab2d788	vk/compiler: Handle interpolation qualifiers for SPIR-V shaders	2015-09-01 15:15:04 -07:00
Jason Ekstrand	126ade0023	vk/extensions: count needs to be <= number of extensions	2015-09-01 12:28:50 -07:00
Jason Ekstrand	0c2d476935	vk/compiler: Properly reference/delete programs when using SPIR-V	2015-09-01 12:28:50 -07:00
Jason Ekstrand	16ebe883a4	vk/meta: Add a helper for making an image from a buffer	2015-08-31 21:54:38 -07:00
Jason Ekstrand	86c3476668	nir/spirv: Use VERTEX_ID_ZERO_BASE for VertexId In Vulkan, VertexId and InstanceId will be zero-based and new intrinsics, VertexIndex and InstanceIndex, will be added for non-zero-based. See also, Khronos bug #14255	2015-08-31 17:16:49 -07:00
Jason Ekstrand	6350c97412	Merge remote-tracking branch 'fdo-personal/nir-spirv' into vulkan From now on, the majority of SPIR-V improvements should happen on the spirv branch which will also be public. It will be frequently merged into the vulkan driver.	2015-08-31 17:14:47 -07:00
Jason Ekstrand	22fdb2f855	nir/spirv: Update to the latest revision	2015-08-31 17:05:23 -07:00
Jason Ekstrand	ce70cae756	nir/builder: Use nir_after_instr to advance the cursor This should ensure that the cursor gets properly advanced in all cases. We had a problem before where, if the cursor was created using nir_after_cf_node on a non-block cf_node, that would call nir_before_block on the block following the cf node. Instructions would then get inserted in backwards order at the top of the block which is not at all what you would expect from nir_after_cf_node. By just resetting to after_instr, we avoid all these problems.	2015-08-31 17:05:23 -07:00
Jason Ekstrand	24b0c53231	nir/intrinsics: Move to a two-dimensional binding model for UBO's	2015-08-31 17:05:23 -07:00
Jason Ekstrand	f4608bc530	nir/nir_variable: Add a descriptor set field We need this for SPIR-V	2015-08-31 17:05:23 -07:00
Jason Ekstrand	85cf2385c5	mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h It is a shader enum after all...	2015-08-31 17:05:23 -07:00
Jason Ekstrand	de4f379a70	nir/cursor: Add a helper for getting the current block	2015-08-31 17:05:23 -07:00
Connor Abbott	024c49e95e	nir/builder: add a nir_fdot() convenience function	2015-08-31 17:05:23 -07:00
Jason Ekstrand	f6a0eff1ba	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves.	2015-08-31 17:05:23 -07:00
Jason Ekstrand	4956bbaa33	nir/cursor: Add a constructor for the end of a block but before the jump	2015-08-31 16:58:20 -07:00
Connor Abbott	c62be38286	nir/types: add more nir_type_is_xxx() wrappers	2015-08-31 16:58:20 -07:00
Connor Abbott	a1e136711b	nir/types: add a helper to transpose a matrix type	2015-08-31 16:58:20 -07:00
Jason Ekstrand	756b00389c	nir/spirv: Don't assert that the current block is empty It's possible that someone will give us SPIR-V code in which someone needlessly branches to new blocks. We should handle that ok now.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	fe220ebd37	nir/spirv: Add initial support for samplers	2015-08-31 16:58:20 -07:00
Jason Ekstrand	a992909aae	nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops NIR doesn't have the native opcodes for them anymore	2015-08-31 16:58:20 -07:00
Jason Ekstrand	45963c9c64	nir/types: Add support for sampler types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2887e68f36	nir/spirv: Make the global constants in spirv.h static I've been promised in a bug that this will be fixed in a future version of the header. However, in the interest of my branch building, I'm adding these changes in myself for the moment.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	62b094a81c	nir/spirv: Handle jump-to-loop in a more general way	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ca51d926fd	nir/spirv: Handle boolean uniforms correctly	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b6562bbc30	nir/spirv: Handle control-flow with loops	2015-08-31 16:58:20 -07:00
Jason Ekstrand	4a63761e1d	nir/spirv: Set a name on temporary variables	2015-08-31 16:58:20 -07:00
Jason Ekstrand	6fc7911d15	nir/spirv: Use the correct length for copying string literals	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9da6d808be	nir/spirv: Make vtn_ssa_value handle constants as well as ssa values	2015-08-31 16:58:20 -07:00
Jason Ekstrand	1feeee9cf4	nir/spirv: Add initial support for GLSL 4.50 builtins	2015-08-31 16:58:20 -07:00
Jason Ekstrand	577c09fdad	nir/spirv: Split the core datastructures into a header file	2015-08-31 16:58:20 -07:00
Jason Ekstrand	66fc7f252f	nir/spirv: Use the builder for all instructions We don't actually use it to create all the instructions but we do use it for insertion always. This should make things far more consistent for implementing extended instructions.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9e03b6724c	nir/spirv: Add support for a bunch of ALU operations	2015-08-31 16:58:20 -07:00
Jason Ekstrand	91b3b46d8b	nir/spirv: Add support for indirect array accesses	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9197e3b9fc	nir/spirv: Explicitly type constants and SSA values	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b7904b8281	nir/spirv: Handle OpBranchConditional We do control-flow handling as a two-step process. The first step is to walk the instructions list and record various information about blocks and functions. This is where the actual nir_function_overload objects get created. We also record the start/stop instruction for each block. Then a second pass walks over each of the functions and over the blocks in each function in a way that's NIR-friendly and actually parses the instructions.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	d216dcee94	nir/spirv: Add a helper for getting a value as an SSA value	2015-08-31 16:58:20 -07:00
Jason Ekstrand	f36fabb736	nir/spirv: Split instruction handling into preamble and body sections	2015-08-31 16:58:20 -07:00
Jason Ekstrand	7bf4b53f1c	nir/spirv: Implement load/store instructions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	7d64741a5e	nir: Add a helper for getting the tail of a deref chain	2015-08-31 16:58:20 -07:00
Jason Ekstrand	112c607216	nir/spirv: Actually add variables to the function or shader	2015-08-31 16:58:20 -07:00
Jason Ekstrand	4fa1366392	nir/spirv: Add a vtn_untyped_value helper	2015-08-31 16:58:20 -07:00
Jason Ekstrand	e709a4ebb8	nir/spirv: Use vtn_value in the types code and fix a off-by-one error	2015-08-31 16:58:20 -07:00
Jason Ekstrand	67af6c59f2	nir/types: Add an is_vector_or_scalar helper	2015-08-31 16:58:20 -07:00
Jason Ekstrand	5e6c5e3c8e	nir/spirv: Add support for deref chains	2015-08-31 16:58:20 -07:00
Jason Ekstrand	366366c7f7	nir/types: Add a scalar type constructor	2015-08-31 16:58:20 -07:00
Jason Ekstrand	befecb3c55	nir/spirv: Add support for OpLabel	2015-08-31 16:58:20 -07:00
Jason Ekstrand	399e962d25	nir/spirv: Add support for declaring functions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ac4d459aa2	nir/types: Add accessors for function parameter/return types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	3a266a18ae	nir/spirv: Add support for declaring variables Deref chains and variable load/store operations are still missing.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2494055631	nir/spirv: Add support for constants	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2a023f30a6	nir/spirv: Add basic support for types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	5bb94c9b12	nir/types: Add more helpers for creating types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	53bff3e445	glsl/types: Expose the function_param and struct_field structs to C Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find them. This commit simply moves the #ifdef and adds #ifdef's around constructors.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	0db3e4dd72	glsl/types: Add support for function types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	1169fcdb05	glsl: Add GLSL_TYPE_FUNCTION to the base types enums	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b79916dacc	nir/spirv: Rework the way values are added Instead of having functions to add values and set various things, we just have a function that does a few asserts and then returns the value. The caller is then responsible for setting the various fields.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ac60aba351	nir/spirv: Add stub support for extension instructions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	78eabc6153	REVERT: Add a simple helper program for testing SPIR-V -> NIR translation	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2c585a722d	glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b20d9f5643	nir: Add the start of a SPIR-V to NIR translator At the moment, it can handle the very basics of strings and can ignore debug instructions. It also has basic support for decorations.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9d92b4fd0e	nir: Import the revision 30 SPIR-V header from Khronos	2015-08-31 16:58:20 -07:00
Jason Ekstrand	0af4bf4d4b	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-31 16:30:07 -07:00
Jason Ekstrand	9f9628e9dd	vk/SPIR-V: Pull num_uniform_components out of the NIR shader	2015-08-28 22:31:03 -07:00
Jason Ekstrand	44e6ea74b0	spirv: lower outputs to temporaries	2015-08-28 17:38:41 -07:00
Jason Ekstrand	9cebdd78d8	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves.	2015-08-28 17:38:41 -07:00
Jason Ekstrand	5e7c7b2a4e	spirv: Only do a block load if you're actually loading a uniform	2015-08-28 16:17:45 -07:00
Jason Ekstrand	98abed2441	spirv: Use VERTEX_ID_ZERO_BASE for vertex id	2015-08-28 16:08:29 -07:00
Jason Ekstrand	dbc3eb5bb4	vk/compiler: Pass the correct is_scalar value to brw_process_nir	2015-08-28 12:13:17 -07:00
Jason Ekstrand	ea56d0cb1d	glsl/types: Fix up function type hash table insertion	2015-08-28 12:00:25 -07:00
Chad Versace	a2d15ee698	vk/meta: Support stencil in vkCmdCopyImageToBuffer At Crucible commit 12e64a4, fixes the func.depthstencil.stencil-triangles.* tests on Broadwell.	2015-08-28 08:41:21 -07:00
Chad Versace	84cfc08c10	vk/pipeline: Fix crash when the pipeline has no attributes If there are no attributes, don't emit 3DSTATE_VERTEX_ELEMENTS. That packet does not allow 0 attributes.	2015-08-28 08:07:15 -07:00
Chad Versace	053d32d2a5	vk/image: Linear stencil buffers are illegal The hardware requires that stencil buffer memory be W-tiled. From the Sandybridge PRM: This buffer is supported only in Tile W memory.	2015-08-28 08:04:59 -07:00
Chad Versace	14e1d58fb7	vk: Fix stride of stencil buffers Stencil buffers have strange pitch. The PRM says: The pitch must be set to 2x the value computed based on width, as the stencil buffer is stored with two rows interleaved.	2015-08-28 08:03:46 -07:00
Chad Versace	31af126229	vk: Program stencil ops in 3DSTATE_WM_DEPTH_STENCIL The driver ignored the Vulkan stencil, always programming the hardware stencil op to 0 (STENCILOP_KEEP).	2015-08-28 08:00:56 -07:00
Chad Versace	bff2879abe	vk/image: Don't abort when creating stencil image views When creating a stencil image view, log a FINISHME but don't abort. We're sooooo close to having this working.	2015-08-28 07:59:59 -07:00
Chad Versace	4f852c76dc	vk/meta: Save/restore VkDynamicDepthStencilState	2015-08-28 07:59:29 -07:00
Chad Versace	104c4e5ddf	vk/meta: Don't skip clearing when clearing only depth attachment anv_cmd_buffer_clear_attachments() skipped the clear renderpass if no color attachments needed to be cleared, even if a depth attachment needed to be cleared.	2015-08-28 07:58:51 -07:00
Chad Versace	aacb7bb9b6	vk: Add func anv_cmd_buffer_get_depth_stencil_view() This function removes some duplicated code from genN_cmd_buffer_emit_depth_stencil().	2015-08-28 07:57:34 -07:00
Chad Versace	641c25dd55	vk: Declare some local variables as const In anv_cmd_buffer_emit_depth_stencil(), declare 'subpass' and 'fb' as const.	2015-08-28 07:53:24 -07:00
Chad Versace	c6f19b4248	vk: Don't duplicate anv_depth_stencil_view's surface data In anv_depth_stencil_view, replace the members bo depth_offset depth_stride depth_format depth_qpitch stencil_offset stencil_stride stencil_qpitch with the single member const struct anv_image *image The removed members duplicated data in anv_image::depth_surface and anv_image::stencil_surface.	2015-08-28 07:52:19 -07:00
Chad Versace	35b0262a2d	vk/gen7: Add func gen7_cmd_buffer_emit_depth_stencil() This patch moves all the GEN7_3DSTATE_DEPTH_BUFFER code from gen7_cmd_buffer_begin_subpass() into a new function gen7_cmd_buffer_emit_depth_stencil().	2015-08-28 07:46:16 -07:00
Chad Versace	b2ee317e24	vk: Fix format of anv_depth_stencil_view The format of the view itself and of the view's image may differ. Moreover, if the view's format has no depth aspect but the image's format does, we must not program the depth buffer. Ditto for stencil.	2015-08-28 07:44:32 -07:00
Chad Versace	798acb2464	vk/gen7: Fix gen of emitted packet in gen7_batch_lri() Emit GEN7_MI_LOAD_REGISTER_IMM, not the GEN8 version.	2015-08-28 07:36:35 -07:00
Chad Versace	4461392343	vk: Remove dummy anv_depth_stencil_view	2015-08-28 07:35:39 -07:00
Chad Versace	941b48e992	vk/image: Let anv_image have one anv_surface per aspect Split anv_image::primary_surface into two: anv_image::color_surface and depth_surface.	2015-08-28 07:17:54 -07:00
Jason Ekstrand	c313a989b4	spirv: Bump to the public revision 31	2015-08-27 15:24:04 -07:00
Jason Ekstrand	2a8d1ac958	vk: Update to API version 0.138.2	2015-08-27 11:41:04 -07:00
Jason Ekstrand	4e3ee043c0	vk/gen8: Add support for push constants	2015-08-27 10:25:58 -07:00
Jason Ekstrand	375a65d5de	vk/private.h: Handle a NULL bo but valid offset in __gen_combine_address	2015-08-27 10:25:58 -07:00
Jason Ekstrand	c8365c55f5	vk/cmd_buffer: Set the CONSTANTS_REL_GENERAL flag on execbuf This tells the kernel that the push constant buffers are relative to the dynamic state base address.	2015-08-27 10:25:58 -07:00
Jason Ekstrand	efc2cce01f	HACK: Don't call nir_setup_uniforms We're doing our own uniform setup and we don't need to call into the entire GL stack to mess with things.	2015-08-27 10:25:58 -07:00
Jason Ekstrand	33cabeab01	vk/compiler: Add a helper for setting up prog_data->param This new helper sets it up the way we'll want for handling push constants.	2015-08-27 10:25:16 -07:00
Jason Ekstrand	5446bf352e	vk: Add initial API support for setting push constants This doesn't add support for actually uploading them, it just ensures that we have and update the shadow copy.	2015-08-26 17:59:15 -07:00
Jason Ekstrand	36134e1050	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-26 11:04:30 -07:00
Jason Ekstrand	74e076bba8	vk/meta: Destroy vertex shaders when setting up clearing	2015-08-25 18:51:26 -07:00
Jason Ekstrand	4bb9915755	vk/gen8: Don't duplicate generic pipeline setup gen8_graphics_pipeline_create had a bunch of stuff in it that's already set up by anv_pipeline_init. The duplication was causing double-initialization of a state stream and made valgrind very angry.	2015-08-25 18:41:25 -07:00
Jason Ekstrand	9b387b5d3f	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-25 18:41:21 -07:00
Kristian Høgsberg Kristensen	5360edcb30	vk/vec4: Use the right constant for offset into a UBO We were using constant 0, which is the set. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 16:14:59 -07:00
Kristian Høgsberg Kristensen	647a60226d	vk: Use true/false for RenderCacheReadWriteMode This field in surface state is a bool, WriteOnlyCache is an enum from GEN8. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:58:21 -07:00
Kristian Høgsberg Kristensen	7e5afa75b5	vk: Support descriptor sets and bindings in vec4 ubo loads Still incomplete, but at least we get the simplest case working. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:57:12 -07:00
Kristian Høgsberg Kristensen	00e7799c69	vk/gen7: Enable L3 caching for GEN7 MOCS Do what GL does here. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:55:56 -07:00
Kristian Høgsberg Kristensen	6a1098b2c2	vk/gen7: Use TILEWALK_XMAJOR for linear surfaces You wouldn't think the TileWalk mode matters when TiledSurface is false. However, it has to be TILEWALK_XMAJOR. Make it so. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 10:54:13 -07:00
Kristian Høgsberg Kristensen	f1455ffac7	vk: Add gen7 support With all the previous commits in place, we can now drop in support for multiple platforms. First up is gen7 (Ivybridge). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	891995e55b	vk: Move 3DSTATE_SBE setup to just before 3DSTATE_PS This is a more logical place for it, between geometry front end state and pixel backend state. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	9c752b5b38	vk: Move generic pipeline init to anv_pipeline.c This logic will be shared between multiple gens. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	3800573fb5	vk: Move gen8 specific state into gen8 sub-structs This commit moves all occurrences of gen8 specific state into a gen8 substruct. This clearly identifies the state as gen8 specific and prepares for adding gen7 state structs. In the process we also rename the field names to exactly match the command or state packet name, without the 3DSTATE prefix, eg: 3DSTATE_VF -> gen8.vf 3DSTATE_WM_DEPTH_STENCIL -> gen8.wm_depth_stencil Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	615da3795a	vk: Always use a placeholder vertex shader in meta The clear pipeline didn't have a vertex shader and relied on the clear shader being hardcoded by the compiler to accept one attribute. This necessitated a few special cases in the 3DSTATE_VS setup. Instead, always provide a vertex shader, even if we disable VS dispatch. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	ac738ada7a	vk: Trim out irrelevant 0-initialized surface state fields Many of these fields aren't used for buffer surfaces, so leave them out for brevity. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	963a1e35e7	vk: Update generated headers This adds VALIGN_2 and VALIGN_4 defines for IVB and HSW RENDER_SURFACE_STATE. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	f5275f7eb3	vk: Move anv_color_attachment_view_init() to gen8_state.c I'd prefer to move anv_CreateAttachmentView() as well, but it's a little too much generic code to just duplicate for each gen. For now, we'll add a anv_color_attachment_view_init() to dispatch to the gen specific implementation, which we then call from anv_CreateAttachmentView(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	988341a73c	vk: Move anv_CreateImageView to gen8_state.c We'll probably want to move some code back into a shared init function, but this gets one GEN8 surface state initialization out of anv_image.c. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	bc568ee992	vk: Make anv_cmd_buffer_begin_subpass() switch on gen Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	8fe74ec45c	vk: Add generic wrapper for filling out buffer surface state We need this for generating surface state on the fly for dynamic buffer views. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	a2b822185e	vk: Add helper for adding surface state reloc We're going to have to do this differently for earlier gens, so lets do it in place only. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	e43fc871be	vk: Make batch chain code gen-agnostic Since the extra dword in MI_BATCH_BUFFER_START added in gen8 is at the end of the struct, we can emit the gen8 packet on all gens as long as we set the instruction length correctly. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	25ab43ee8c	vk: Move vkCmdPipelineBarrier to gen8_cmd_buffer.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	b4ef2302a9	vk: Use helper function for emitting MI_BATCH_BUFFER_START Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	97360ffc6c	vk: Use anv_batch_emit() for chaining back to primary batch We used to use a manual GEN8_MI_BATCH_BUFFER_START_pack() call, but this refactors the code to use anv_batch_emit(); Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	cff717c649	vk: Downgrade state packet to gen7 where they're common Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	64045eebfb	vk: Reorder gen8 specific code into three new files We'll organize gen specific code in three files per gen: pipeline, cmd_buffer and state, eg: gen8_cmd_buffer.c gen8_pipeline.c gen8_state.c where gen8_cmd_buffer.c holds all vkCmd* entry points, gen8_pipeline.c all gen specific code related to pipeline building and remaining state code (sampler, surface state, dynamic state) in gen8_state.c. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	9f0bb5977b	vk: Move gen8_CmdBindIndexBuffer() to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	a7649b2869	vk: Move gen8_cmd_buffer_emit_state_base_address() to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	130db30771	vk: Move gen8 specific parts of queries to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	98126c021f	vk: Move dynamic depth stencil to anv_gen8.c	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	0bcf85d79f	vk: Move pipeline creation to anv_gen8.c	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	ef0ab62486	vk: Move anv_CreateSampler to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	fb428727e0	vk: Move anv_CreateBufferView to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	74556b076a	vk: Add new anv_gen8.c and move CreateDynamicRasterState there Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	ee9788973f	vk: Implement multi-gen dispatch mechanism	2015-08-24 13:45:39 -07:00
Chad Versace	c4e7ed9163	vk/meta: Implement depth clears Fixes Crucible test func.depthstencil.basic-depth.clear-1.0.op-greater.	2015-08-20 10:25:05 -07:00
Chad Versace	0db3d67a14	vk: Cache each render pass's number of clear ops During vkCreateRenderPass, count the number of clear ops and store them in new members of anv_render_pass: uint32_t num_color_clear_attachments bool has_depth_clear_attachment bool has_stencil_clear_attachment Caching these 8 bytes (including padding) reduces the number of times that anv_cmd_buffer_clear_attachments needs to loop over the pass's attachments.	2015-08-20 10:25:04 -07:00
Chad Versace	2387219101	vk: Use temp var in vkCreateRenderPass's attachment loop Store the attachment in a temporary variable and s/pass->attachments[i]/att/ .	2015-08-20 10:25:04 -07:00
Chad Versace	1c24a191cd	vk: Improve memory locality of anv_render_pass Allocate the pass's array of attachments, anv_render_pass::attachments, in the same allocation as the pass itself.	2015-08-20 09:31:58 -07:00
Chad Versace	4eaf90effb	vk: Unhardcode an argument to sizeof s/struct anv_subpass/pass->subpasses[0])/	2015-08-20 09:31:58 -07:00
Chad Versace	44ef4484c8	vk/meta: Add Z coord to clear vertices For now, the Z coordinate is always 0.0. Will later be used for depth clears.	2015-08-20 09:31:12 -07:00
Chad Versace	4aef5c62cd	vk/meta: Restore all saved state in anv_cmd_buffer_restore() anv_cmd_buffer_restore() did not restore the old VkDynamicColorBlendState.	2015-08-20 09:30:34 -07:00
Chad Versace	9f908fcbde	vk/meta: Use consistent names and types in anv_saved_state In struct anv_saved_state, each member's type was a pointer to an Anvil struct and each member's name was prefixed with "old" except cb_state, which was a Vulkan handle whose name lacked "old".	2015-08-20 09:29:41 -07:00
Neil Roberts	49d9e89d00	Add mesa.icd to the .gitignore Since `4d7e0fa8c7` this file is generated by the configure script. Reviewed-by: Tapani Palli <tapani.palli@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (cherry picked from commit `885762e182`)	2015-08-19 14:12:31 -07:00
Chad Versace	bd0aab9a58	vk/meta: Fix dest format of vkCmdCopyImage The source image's format was incorrectly used for both the source view and destination view. For vkCmdCopyImage to correctly translate formats, the destination view's format must be that of the destination image's.	2015-08-18 12:44:06 -07:00
Chad Versace	b0875aa911	vk: Assert that swap chain format is a color format	2015-08-18 12:43:57 -07:00
Chad Versace	d52822541e	vk/image: Don't set anv_surface_view::offset twice It was set twice a few lines apart, and the second setting always overrode the first.	2015-08-18 11:48:50 -07:00
Chad Versace	e7d3a5df5a	vk/meta: Use anv_format_is_color() That is, replace !anv_format_is_depth_or_stencil() with anv_format_is_color(). That conveys the meaning better.	2015-08-18 11:48:48 -07:00
Chad Versace	50f7bf70da	vk: Add anv_format_is_color()	2015-08-18 11:48:46 -07:00
Chad Versace	6ff95bba8a	vk: Add anv_format reference to anv_render_pass_attachment Change type of anv_render_pass_attachment::format from VkFormat to const struct anv_format*. This eliminates the repetitive lookups into the VkFormat -> anv_format table when looping over attachments during anv_cmd_buffer_clear_attachments().	2015-08-17 14:08:55 -07:00
Chad Versace	5a6b2e6df0	vk/image: Simplify stencil case for anv_image_create() Stop creating a temporary VkImageCreateInfo with overridden format=VK_FORMAT_S8_UINT. Instead, just pass the format override directly to anv_image_make_surface().	2015-08-17 14:08:55 -07:00
Chad Versace	a9c36daa83	vk/formats: Add global pointer to anv_format for S8_UINT Stencil formats are often a special case. To reduce the number of lookups into the VkFormat-to-anv_format translation table when working with stencil, expose the table's entry for VK_FORMAT_S8_UINT as global variable anv_format_s8_uint.	2015-08-17 14:08:55 -07:00
Chad Versace	60c4ac57f2	vk: Add anv_format reference to anv_surface_view Change type of anv_surface_view::format from VkFormat to const struct anv_format*. This reduces the number of lookups in the VkFormat -> anv_format table.	2015-08-17 14:08:55 -07:00
Chad Versace	c11094ec9a	vk: Pass anv_format to anv_fill_buffer_surface_state() This moves the translation of VkFormat to anv_format from anv_fill_buffer_surface_state() to its caller. A prep commit to reduce more VkFormat -> anv_format translations.	2015-08-17 14:08:55 -07:00
Chad Versace	ded736f16a	vk: Add anv_format reference to anv_image Change type of anv_image::format from VkFormat to const struct anv_format*. This reduces the number of lookups in the VkFormat -> anv_format table.	2015-08-17 14:08:55 -07:00
Chad Versace	4ae42c83ec	vk: Store the original VkFormat in anv_format Store the original VkFormat as anv_format::vk_format. This will be used to reduce format indirection, such as lookups into the VkFormat -> anv_format translation table.	2015-08-17 14:07:44 -07:00
Jason Ekstrand	e39e1f4d24	vk: Update .gitignore for the autogenerated spirv changes	2015-08-17 11:47:25 -07:00
Kristian Høgsberg Kristensen	aac6f7c3bb	vk: Drop aub dumper and PCI ID override feature These are now available in intel_aubdump from intel-gpu-tools. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-17 11:41:19 -07:00
Kristian Høgsberg Kristensen	6d09d0644b	vk: Use anv_image_create() for creating dmabuf VkImage We need to make sure we use the VkImage infrastructure for creating dmabuf images. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-17 11:41:19 -07:00
Jason Ekstrand	0deae66eb1	vk: Add an _autogen suffix to autogenerated spirv file names This prevents make from stomping on nir_spirv.h	2015-08-17 11:40:16 -07:00
Jason Ekstrand	6a7ca4ef2c	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-17 11:25:03 -07:00
Jason Ekstrand	b4c02253c4	vk: Add four unit tests for our lock-free data-structures	2015-08-14 17:04:39 -07:00
Jason Ekstrand	16c5b9f4ed	vk: Build a version of the driver for linking into unit tests	2015-08-14 17:04:39 -07:00
Kristian Høgsberg Kristensen	30d82136bb	vk: Update generated headers This update brings usable IVB/HSW RENDER_SURFACE_STATE structs and adds more float fields that we previously failed to recognize.	2015-08-12 21:05:32 -07:00
Kristian Høgsberg Kristensen	9564dd37a0	vk: Query aperture size up front in anv_physical_device_init() We already query the device in various ways here and we can just also get the aperture size. This avoids keeping an extra drm fd open during the lifetime of the driver. Also, we need to use explicit 64 bit types for the aperture size, not size_t.	2015-08-10 17:18:55 -07:00
Kristian Høgsberg Kristensen	8605ee60e0	vk: Share upload logic and add size assert This lets us hit an assert if we exceed the block pool size instead of GPU hanging. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-10 17:17:45 -07:00
Jason Ekstrand	6757e2f75c	vk/cmd_buffer: Allow for null VkCmdPool's	2015-08-04 14:01:08 -07:00
Kristian Høgsberg Kristensen	4b097d73e6	vk: Call anv_batch_emit_dwords() up front in anv_batch_emit() This avoids putting a memory barrier between the template struct and the pack function, which generates much better code.	2015-08-03 15:38:14 -07:00
Kristian Høgsberg Kristensen	fbb119061e	vk: Update generated headers This adds zeroing of reserved blocks of dwords and removes an instruction definition.	2015-08-03 15:21:27 -07:00
Jason Ekstrand	facf587dea	vk/allocator: Solve a data race in anv_block_pool The anv_block_pool data structure suffered from the exact same race as the state pool. Namely, that the uniqueness of the blocks handed out depends on the next_block value increasing monotonically. However, this invariant did not hold thanks to our block "return" concept.	2015-08-03 01:19:34 -07:00
Jason Ekstrand	5e5a783530	vk: Add and use an anv_block_pool_size() helper	2015-08-03 01:18:09 -07:00
Jason Ekstrand	56ce219493	vk/allocator: Make block_pool_grow take and return a size It takes the old size as an argument and returns the new size as the return value. On error, it returns a size of 0.	2015-08-03 01:06:45 -07:00
Jason Ekstrand	fd64598462	vk/allocator: Fix a data race in the state pool The previous algorithm had a race because of the way we were using __sync_fetch_and_add for everything. In particular, the concept of "returning" over-allocated states in the "next > end" case was completely bogus. If too many threads were hitting the state pool at the same time, it was possible to have the following sequence: A: Get an offset (next == end) B: Get an offset (next > end) A: Resize the pool (now next < end by a lot) C: Get an offset (next < end) B: Return the over-allocated offset D: Get an offset in which case D will get the same offset as C. The solution to this race is to get rid of the concept of "returning" over-allocated states. Instead, the thread that gets a new block simply sets the next and end offsets directly and threads that over-allocate don't return anything and just futex-wait. Since you can only ever hit the over-allocate case if someone else hit the "next == end" case and hasn't resized yet, you're guaranteed that the end value will get updated and the futex won't block forever.	2015-08-03 00:38:48 -07:00
Jason Ekstrand	481122f4ac	vk/allocator: Make a few things more consistent	2015-08-03 00:35:19 -07:00
Jason Ekstrand	e65953146c	vk/allocator: Use memory pools rather than (MALLOC\|FREE)LIKE We have pools, so we should be using them. Also, I think this will help keep valgrind from getting confused when we have to end up fighting with system allocations such as those from malloc/free and mmap/munmap.	2015-07-31 10:38:28 -07:00
Jason Ekstrand	1920ef9675	vk/allocator: Add an anv_state_pool_finish function Currently this is a no-op but it gives us a place to put finalization things in the future.	2015-07-31 10:38:28 -07:00
Jason Ekstrand	930598ad56	vk/instance: valgrind-guard client-provided allocations	2015-07-31 10:38:23 -07:00
Jason Ekstrand	e40bdcef1f	vk/device: Add anv_instance_alloc/free helpers This way we can more consistently alloc/free the device and it will provide us a better place to put valgrind hooks in the next patch	2015-07-31 10:14:17 -07:00
Jason Ekstrand	0f050aaa15	vk/device: Mark newly allocated memory as undefined for valgrind This way valgrind still works even if the client gives us memory that has been initialized or re-uses memory for some reason.	2015-07-31 09:44:42 -07:00
Jason Ekstrand	1f49a7d9fc	vk/batch_chain: Decrement num_relocs instead of incrementing it	2015-07-31 09:11:47 -07:00
Jason Ekstrand	220a01d525	vk/batch_chain: Compute secondary exec mode after finishing the bo Figuring out whether or not to do a copy requires knowing the length of the final batch_bo. This gets set by anv_batch_bo_finish so we have to do it afterwards. Not sure how this was even working before.	2015-07-31 08:52:30 -07:00
Jason Ekstrand	26ba0ad54d	vk: Re-name command buffer implementation files Previously, the command buffer implementation was split between anv_cmd_buffer.c and anv_cmd_emit.c. However, this naming convention was confusing because none of the Vulkan entrypoints for anv_cmd_buffer were actually in anv_cmd_buffer.c. This changes it so that anv_cmd_buffer.c is what you think it is and the internals are in anv_batch_chain.c.	2015-07-30 15:00:42 -07:00
Jason Ekstrand	e379cd9a0e	vk/cmd_buffer: Add a simple command pool implementation	2015-07-30 14:55:49 -07:00
Jason Ekstrand	4c2a182a36	vk/cmd_buffer: Add support for zero-copy batch chaining	2015-07-30 14:22:17 -07:00
Jason Ekstrand	21004f23bf	vk: Add initial support for secondary command buffers	2015-07-30 11:36:48 -07:00
Jason Ekstrand	5aee803b97	vk/cmd_buffer: Split batch chaining into a helper function	2015-07-30 11:34:58 -07:00
Jason Ekstrand	0c4a2dab7e	vk/device: Make BATCH_SIZE a global #define	2015-07-30 11:34:09 -07:00
Jason Ekstrand	ace093031d	vk/cmd_buffer: Add functions for cloning a list of anv_batch_bo's We'll need this to implement secondary command buffers.	2015-07-30 11:32:27 -07:00
Jason Ekstrand	7af67e085f	vk/reloc_list: Actually set the new length in reloc_list_grow	2015-07-30 11:29:55 -07:00
Jason Ekstrand	f15be18c92	util/list: Add list splicing functions This adds functions for splicing one list into another. These have more-or-less the same API as the kernel list splicing functions.	2015-07-30 11:28:22 -07:00
Jason Ekstrand	e39d0b635c	CLONE	2015-07-30 08:24:02 -07:00
Jason Ekstrand	82548a3aca	vk/cmd_buffer: Invalidate texture cache in emit_state_base_address Previously, the caller of emit_state_base_address was doing this. However, putting it directly in emit_state_base_address means that we'll never forget the flush at the cost of one PIPE_CONTROL at the top of every batch (that should do nothing since the kernel just flushed for us).	2015-07-30 08:24:02 -07:00
Jason Ekstrand	56ce896d73	vk/cmd_buffer: Rename emit_batch_buffer_end to end_batch_buffer This is more generic and doesn't imply that it emits MI_BATCH_BUFFER_END. While we're at it, we'll move NOOP adding from bo_finish to end_batch_buffer.	2015-07-30 08:24:02 -07:00
Jason Ekstrand	3ed9cea84d	vk/cmd_buffer: Use an array to track all known anv_batch_bo objects Instead of walking the list of batch and surface buffers, we simply keep track of all known batch and surface buffers as we build the command buffer. Then we use this new list to construct the validate list.	2015-07-29 15:30:15 -07:00
Jason Ekstrand	0f31c580bf	vk/cmd_buffer: Rework validate list creation The algorithm we used previously required us to call add_bo in a particular order in order to guarantee that we get the initial batch buffer as the last element in the validate list. The new algorithm does a recursive walk over the buffers and then re-orders the list. This should be much more robust as we start to add circular dependencies in the relocations.	2015-07-29 15:16:54 -07:00
Jason Ekstrand	4fc7510a7c	vk/cmd_buffer: Move emit_batch_buffer_end higher in the file	2015-07-29 12:01:08 -07:00
Jason Ekstrand	8208f01a35	vk/cmd_buffer: Store the relocation list in the anv_batch_bo struct Before, we were doing this thing where we had one big relocation list for the whole command buffer and each subbuffer took a chunk out of it. Now, we store the actual relocation list in the anv_batch_bo. This comes at the cost of more small allocations but makes a lot of things simpler.	2015-07-29 12:01:08 -07:00
Jason Ekstrand	7d50734240	vk/batch: Make relocs a pointer to a relocation list Previously anv_batch.relocs was an actual relocation list. However, this is limiting if the implementation of the batch wants to change the relocation list as the batch progresses.	2015-07-29 12:01:08 -07:00
Kristian Høgsberg Kristensen	fcea3e2d23	vk/headers: Update to new generated gen headers This update fixes cases where a 48-bit address field was split into two parts: __gen_address_type MemoryAddress; uint32_t MemoryAddressHigh; which causes this pack code to be generated: dw[1] = __gen_combine_address(data, &dw[1], values->MemoryAddress, dw1); dw[2] = __gen_field(values->MemoryAddressHigh, 0, 15) \| 0; which breaks for addresses above 4G. This update also fixes arrays of structs in commands and structs, for example, we now have: struct GEN8_BLEND_STATE_ENTRY Entry[8]; and the pack functions now write all dwords in the packet, making valgrind happy. Finally, we would try to pack 64 bits of blend state into a uint32_t - that's also fixed now.	2015-07-29 11:02:33 -07:00
Jason Ekstrand	65f3d00cd6	vk/cmd_buffer: Update a comment	2015-07-29 08:33:56 -07:00
Jason Ekstrand	86a53d2880	vk/cmd_buffer: Use a doubly-linked list for batch and surface buffers This is probably better than hand-rolling the list of buffers.	2015-07-28 17:47:59 -07:00
Jason Ekstrand	6aba52381a	vk/aub: Use the data directly from the execbuf2 Previously, we were crawling through the anv_cmd_buffer datastructure to pull out batch buffers and things. This meant that every time something in anv_cmd_buffer changed, we broke aub dumping. However, aub dumping should just dump the stuff the kernel knows about so we really don't need to be crawling driver internals.	2015-07-28 16:53:45 -07:00
Jason Ekstrand	3c2743dcd1	vk/cmd_buffer: Pull the execbuf stuff into a substruct	2015-07-27 16:37:09 -07:00
Jason Ekstrand	4ced8650d4	vk/cmd_buffer: Move the remaining entrypoints into cmd_emit.c	2015-07-27 15:14:31 -07:00
Jason Ekstrand	d4c249364d	vk/cmd_buffer: Move the re-emission of STATE_BASE_ADDRESS to the flushing code This used to happen magically in cmd_buffer_new_surface_state_bo. However, according to Ken, STATE_BASE_ADDRESS is very gen-specific so we really shouldn't have it in the generic data-structure code.	2015-07-27 15:05:06 -07:00
Jason Ekstrand	117d74b4e2	vk/cmd_buffer: Factor the guts of CmdBufferEnd into two helpers	2015-07-27 14:52:16 -07:00
Jason Ekstrand	8fb6405718	vk/cmd_buffer: Factor the guts of (Create\|Reset\|Destroy)CmdBuffer into helpers	2015-07-27 14:23:56 -07:00
Jason Ekstrand	80ad578c4e	vk/private.h: Re-arrange and better comment anv_cmd_buffer	2015-07-27 12:40:43 -07:00
Jason Ekstrand	50e86b5777	vk: Actually advertise 0.138.1 at runtime	2015-07-23 10:44:27 -07:00
Jason Ekstrand	f884b500d0	vk/vulkan.h: Bump to the version 0.138.1 header This doesn't actually require any implementation changes but it does change an enum so it is ABI-incompatible with 0.138.0.	2015-07-23 10:38:22 -07:00
Jason Ekstrand	e99773badd	vk: Add two more valgrind checks	2015-07-23 08:57:54 -07:00
Jason Ekstrand	b1fcc30ff0	vk/meta: Destroy shader modules	2015-07-22 17:51:26 -07:00
Jason Ekstrand	3460e6cb2f	vk/device: Finish the scratch block pool on device destruction	2015-07-22 17:51:14 -07:00
Jason Ekstrand	867f6cb90c	vk: Add a FreeDescriptorSets function	2015-07-22 17:33:09 -07:00
Jason Ekstrand	c9dc1f4098	vk/pipeline: Be more sloppy about shader entrypoint names The CTS passes in NULL names right now. It's not too hard to support that as just "main". With this, and a patch to vulkancts, we now pass all 6 tests.	2015-07-22 15:26:56 -07:00
Chad Versace	2c2233e328	vk: Prefix most filenames with anv Jason started the task by creating anv_cmd_buffer.c and anv_cmd_emit.c. This patch finishes the task by renaming all other files except gen*_pack.h and glsl_scraper.py.	2015-07-17 20:25:38 -07:00
Chad Versace	f70d079854	vk/image: Remove unneeded data from anv_buffer_view This completes the FINISHME to trim unneeded data from anv_buffer_view. A VkExtent3D doesn't make sense for a VkBufferView. So remove the member anv_surface_view::extent, and push it up to the two objects that actually need it, anv_image_view and anv_attachment_view.	2015-07-17 14:48:23 -07:00
Chad Versace	194b77d426	vk: Document members of anv_surface_view	2015-07-17 14:39:05 -07:00
Chad Versace	169251bff0	vk: Remove more raw casts This removes nearly all the remaining raw Anvil<->Vulkan casts from the C source files. (File compiler.cpp still contains many raw casts, and I plan on ignoring that). As far as I can tell, the only remaining raw casts are: anv_attachment_view -> anv_depth_stencil_view anv_attachment_view -> anv_color_attachment_view	2015-07-17 14:32:22 -07:00
Chad Versace	fc3838376b	vk/image: Add braces around multi-line ifs	2015-07-17 13:38:09 -07:00
Connor Abbott	b2cfd85060	nir/spirv: don't declare builtin blocks They aren't used, and the backend was barfing on them. Also, remove a hack in compiler.cpp now that they're gone.	2015-07-16 11:04:22 -07:00
Connor Abbott	b599735be4	nir/spirv: add support for loading UBO's We directly emit ubo load intrinsics based off of the offset information handed to us from SPIR-V.	2015-07-16 10:54:09 -07:00
Connor Abbott	513ee7fa48	nir/types: add more nir_type_is_xxx() wrappers	2015-07-15 21:58:32 -07:00
Connor Abbott	9fa0989ff2	nir: move to two-level binding model for UBO's The GLSL layer above is still hacky, so we're really just moving the hack into GLSL-to-NIR. I'd rather not go all the way and make GLSL support the Vulkan binding model too, since presumably we'll be switching to SPIR-V exclusively, and so working on proper GLSL support will be a waste of time. For now, doing this keeps it working as we add SPIR-V->NIR support though.	2015-07-15 17:18:48 -07:00
Chad Versace	5520221118	vk: Remove unneeded vulkan-138.h	2015-07-15 17:16:07 -07:00
Chad Versace	73a8f9543a	vk: Bump vulkan.h version to 0.138	2015-07-15 17:16:07 -07:00
Chad Versace	55781f8d02	vk/0.138: Update VkResult values	2015-07-15 17:16:07 -07:00
Chad Versace	756d8064c1	vk/0.132: Do type-safety	2015-07-15 17:16:07 -07:00
Jason Ekstrand	927f54de68	vk/cmd_buffer: Move batch buffer padding to anv_batch_bo_finish()	2015-07-15 17:11:04 -07:00
Jason Ekstrand	9c0db9d349	vk/cmd_buffer: Rename bo_count to exec2_bo_count	2015-07-15 16:56:29 -07:00
Jason Ekstrand	6037b5d610	vk/cmd_buffer: Add a helper for allocating dynamic state This matches what we do for surface state and makes the dynamic state pool more opaque to things that need to get dynamic state.	2015-07-15 16:56:29 -07:00
Jason Ekstrand	7ccc8dd24a	vk/private.h: Move cmd_buffer functions to near the cmd_buffer struct	2015-07-15 16:56:29 -07:00
Jason Ekstrand	d22d5f25fc	vk: Split command buffer state into its own structure Everything else in anv_cmd_buffer is the actual guts of the datastructure.	2015-07-15 16:56:29 -07:00
Jason Ekstrand	da4d9f6c7c	vk: Move most of the anv_Cmd related stuff to its own file	2015-07-15 16:56:28 -07:00
Jason Ekstrand	d862099198	vk: Pull the guts of anv_cmd_buffer into its own file	2015-07-15 16:56:28 -07:00
Chad Versace	498ae009d3	vk/glsl: Replace raw casts Needed for upcoming type-safety changes.	2015-07-15 15:51:37 -07:00
Chad Versace	6f140e8af1	vk/meta: Remove raw casts Needed for upcoming type-safety changes.	2015-07-15 15:51:37 -07:00
Chad Versace	badbf0c94a	vk/x11: Remove raw casts The raw casts in the WSI functions will break the build when the type-safety changes arrive.	2015-07-15 15:49:10 -07:00
Chad Versace	61a4bfe253	vk: Delete vkDbgSetObjectTag() Because VkObject is going away.	2015-07-15 15:34:20 -07:00
Jason Ekstrand	e1c78ebe53	vk/device: Remove unneeded checks for NULL	2015-07-15 15:22:32 -07:00
Jason Ekstrand	f4748bff59	vk/device: Provide proper NULL handling in anv_device_free The Vulkan spec does not specify that the free function provided to CreateInstance must handle NULL properly so we do it in the wrapper. If this ever changes in the spec, we can delete the extra 2 lines.	2015-07-15 15:22:32 -07:00
Chad Versace	4c8e1e5888	vk: Stop internally calling anv_DestroyObject() Replace each anv_DestroyObject() with anv_DestroyFoo(). Let vkDestroyObject() live for a while longer for Crucible's sake.	2015-07-15 15:11:16 -07:00
Chad Versace	f5ad06eb78	vk: Fix vkDestroyObject dispatch for VkRenderPass It called anv_device_free() instead of anv_DestroyRenderPass().	2015-07-15 15:07:41 -07:00
Chad Versace	188f2328de	vk: Fix vkCreate/DestroyRenderPass While updating vkDestroyObject, I discovered that vkDestroyPass reliably crashes. That hasn't been an issue yet, though, because it is never called. In vkCreateRenderPass: - Don't allocate empty attachment arrays. - Ensure that pointers to empty attachment arrays are NULL. - Store VkRenderPassCreateInfo::subpassCount as anv_render_pass::subpass_count. In vkDestroyRenderPass: - Fix loop bounds: s/attachment_count/subpass_count/ - Don't call anv_device_free on null pointers.	2015-07-15 15:07:41 -07:00
Chad Versace	c6270e8044	vk: Refactor create/destroy code for anv_descriptor_set Define two new functions: anv_descriptor_set_create anv_descriptor_set_destroy	2015-07-15 14:31:22 -07:00
Chad Versace	365d80a91e	vk: Replace some raw casts with safe casts That is, replace some instances of (VkFoo) foo with anv_foo_to_handle(foo)	2015-07-15 14:00:21 -07:00
Chad Versace	7529e7ce86	vk: Correct anv_CreateShaderModule's prototype s/VkShader/VkShaderModule/ :sigh: I look forward to type-safety.	2015-07-15 13:59:47 -07:00
Chad Versace	8213be790e	vk: Define struct anv_image_view, anv_buffer_view Follow the pattern of anv_attachment_view. We need these structs to implement the type-safety that arrived in the 0.132 header.	2015-07-15 12:19:29 -07:00
Chad Versace	43241a24bc	vk/meta: Fix declared type of a shader module s/VkShader/VkShaderModule/ I'm looking forward to a type-safe vulkan.h ;)	2015-07-15 11:49:37 -07:00
Chad Versace	94e473c993	vk: Remove struct anv_object Trivial removal because vkDestroyObject() no longer uses it.	2015-07-15 11:29:43 -07:00
Jason Ekstrand	e375f722a6	vk/device: More documentation on surface state flushing	2015-07-15 11:09:02 -07:00
Connor Abbott	9aabe69028	vk/device: explain why a flush is necessary Jason found this from experimenting, but the docs give a reasonable explanation of why it's necessary.	2015-07-14 23:03:19 -07:00
Chad Versace	5f46c4608f	vk: Fix indentation of anv_dynamic_cb_state	2015-07-14 18:19:10 -07:00
Chad Versace	0eeba6b80c	vk: Add finishmes for VkDescriptorPool VkDescriptorPool is a stub object. As a consequence, it's impossible to free descriptor set memory.	2015-07-14 18:19:00 -07:00
Jason Ekstrand	2b5a4dc5f3	vk: Add vulkan-138 and remove vulkan-0.132 Now, 138 is the target and not 132. Once object destruction is finished, we can delete 138 as it will be identical to vulkan.h	2015-07-14 17:54:13 -07:00
Jason Ekstrand	1f658bed70	vk/device: Add stub support for command pools Real support isn't really that far away. We just need a data structure with a linked list and a few tests.	2015-07-14 17:40:00 -07:00
Jason Ekstrand	ca7243b54e	vk/vulkan.h: Add the stuff for cross-queue resource sharing We only have one queue, so this is currently a no-op on our implementation.	2015-07-14 17:20:50 -07:00
Jason Ekstrand	553b4434ca	vk/vulkan.h: Add a couple of size fields for specialization constants	2015-07-14 17:12:39 -07:00
Jason Ekstrand	e5db209d54	vk/vulkan.h: Move around buffer image granularities	2015-07-14 17:10:37 -07:00
Jason Ekstrand	c7fcfebd5b	vk: Add stubs for all the sparse resource stuff	2015-07-14 17:06:11 -07:00
Jason Ekstrand	2a9136feb4	vk/image: Add a stub for the new ImageFormatProperties function This lets the client query about things like multisample. We don't do multisample right now, so I'll let Chad deal with that when he gets to it.	2015-07-14 17:05:30 -07:00
Jason Ekstrand	2c4dc92f40	vk/vulkan.h: Rename FormatInfo to FormatProperties	2015-07-14 17:04:46 -07:00
Jason Ekstrand	d7f44852be	vk/vulkan.h: Re-order some #define's	2015-07-14 16:41:39 -07:00
Jason Ekstrand	1fd3bc818a	vk/vulkan.h: Rename a function parameter	2015-07-14 16:39:01 -07:00
Jason Ekstrand	2e2f48f840	vk: Remove abbreviations	2015-07-14 16:34:31 -07:00
Jason Ekstrand	02db21ae11	vk: Add the new extension/layer enumeration entrypoints	2015-07-14 16:11:21 -07:00
Jason Ekstrand	a463eacb8f	vk/vulkan.h: Change maxAnisotropy to a float	2015-07-14 15:04:11 -07:00
Jason Ekstrand	98957b18d2	vk/vulkan.h: Add the VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT flag	2015-07-14 15:03:39 -07:00
Jason Ekstrand	a35811d086	vk/vulkan.h: Rename a couple of function parameters No functional change.	2015-07-14 15:03:01 -07:00
Jason Ekstrand	55723e97f1	vk: Split the memory requirements/binding functions	2015-07-14 14:59:39 -07:00
Jason Ekstrand	ccb2e5cd62	vk: Make barriers more precise (rev. 133)	2015-07-14 14:50:35 -07:00
Jason Ekstrand	30445f8f7a	vk: Split the dynamic state binding function into one per state	2015-07-14 14:26:10 -07:00
Jason Ekstrand	d2c0870ff3	vk/vulkan.h: Rename a function parameter to match 132	2015-07-14 14:11:04 -07:00
Jason Ekstrand	8478350992	vk: Implement Multipass	2015-07-14 11:37:14 -07:00
Jason Ekstrand	68768c40be	vk/vulkan.h: Re-arrange some enums and definitions in preparation for 131	2015-07-14 11:32:15 -07:00
Chad Versace	66cbb7f76d	vk/0.132: Add vkDestroyRenderPass()	2015-07-14 11:21:31 -07:00
Chad Versace	6d0ed38db5	vk/0.132: Add vkDestroy*View() vkDestroyColorAttachmentView vkDestroyDepthStencilView These functions are not in the 0.132 header, but adding them will help us attain the type-safety API updates more quickly.	2015-07-14 11:19:22 -07:00
Chad Versace	1ca611cbad	vk/0.132: Add vkDestroyCommandBuffer()	2015-07-14 11:11:41 -07:00
Chad Versace	6eec0b186c	vk/0.132: Add vkDestroyImageView() Just declare it in vulkan.h. Jason defined the function earlier in image.c.	2015-07-14 11:09:14 -07:00
Chad Versace	4b2c5a98f0	vk/0.132: Add vkDestroyBufferView() Just declare it in vulkan.h. Jason already defined the function earlier in vulkan.c.	2015-07-14 11:06:57 -07:00
Chad Versace	08f7731f67	vk/0.132: Add vkDestroyFramebuffer()	2015-07-14 10:59:30 -07:00
Chad Versace	0c8456ef1e	vk/0.132: Add vkDestroyDynamicDepthStencilState()	2015-07-14 10:54:51 -07:00
Chad Versace	b29c929e8e	vk/0.132: Add vkDestroyDynamicColorBlendState()	2015-07-14 10:52:45 -07:00
Chad Versace	5e1737c42f	vk/0.132: Add vkDestroyDynamicRasterState()	2015-07-14 10:51:08 -07:00
Chad Versace	d80fea1af6	vk/0.132: Add vkDestroyDynamicViewportState()	2015-07-14 10:42:45 -07:00
Chad Versace	9250e1e9e5	vk/0.132: Add vkDestroyDescriptorPool()	2015-07-14 10:38:22 -07:00
Chad Versace	f925ea31e7	vk/0.132: Add vkDestroyDescriptorSetLayout()	2015-07-14 10:36:49 -07:00
Chad Versace	ec5e2f4992	vk/0.132: Add vkDestroySampler()	2015-07-14 10:34:00 -07:00
Chad Versace	a684198935	vk/0.132: Add vkDestroyPipelineLayout()	2015-07-14 10:29:47 -07:00
Chad Versace	6e5ab5cf1b	vk/0.132: Add vkDestroyPipeline()	2015-07-14 10:26:17 -07:00
Chad Versace	114015321e	vk/0.132: Add vkDestroyPipelineCache()	2015-07-14 10:19:27 -07:00
Chad Versace	cb57bff36c	vk/0.132: Add vkDestroyShader()	2015-07-14 10:16:22 -07:00
Chad Versace	8ae8e14ba7	vk/0.132: Add vkDestroyShaderModule()	2015-07-14 10:13:09 -07:00
Chad Versace	dd67c134ad	vk/0.132: Add vkDestroyImage() We only need to add it to vulkan.h because Jason defined the function earlier in image.c.	2015-07-14 10:13:00 -07:00
Chad Versace	e18377f435	vk/0.132: Dispatch vkDestroyObject to new destructors Oops. My recent commits added new destructors, but forgot to teach vkDestroyObject about them. They are: vkDestroyFence vkDestroyEvent vkDestroySemaphore vkDestroyQueryPool vkDestroyBuffer	2015-07-14 09:58:22 -07:00
Chad Versace	e93b6d8eb1	vk/0.132: Add vkDestroyBuffer()	2015-07-14 09:47:45 -07:00
Chad Versace	584cb7a16f	vk/0.132: Add vkDestroyQueryPool()	2015-07-14 09:44:58 -07:00
Chad Versace	68c7ef502d	vk/0.132: Add vkDestroyEvent()	2015-07-14 09:33:47 -07:00
Chad Versace	549070b18c	vk/0.132: Add vkDestroySemaphore()	2015-07-14 09:31:34 -07:00
Chad Versace	ebb191f145	vk/0.132: Add vkDestroyFence()	2015-07-14 09:29:35 -07:00
Chad Versace	435ccf4056	vk/0.132: Rename VkDynamic*State types sed -i -e 's/VkDynamicVpState/VkDynamicViewportState/g' \ -e 's/VkDynamicRsState/VkDynamicRasterState/g' \ -e 's/VkDynamicCbState/VkDynamicColorBlendState/g' \ -e 's/VkDynamicDsState/VkDynamicDepthStencilState/g' \ $(git ls-files include/vulkan src/vulkan)	2015-07-13 16:19:28 -07:00
Connor Abbott	ffb51fd112	nir/spirv: update to SPIR-V revision 31 This means that now the internal version of glslangValidator is required. This includes some changes due to the sampler/texture rework, but doesn't actually enable anything more yet. We also don't yet handle UBO's correctly, and don't handle matrix stride and row major/column major yet.	2015-07-13 15:01:01 -07:00
Chad Versace	45f8723f44	vk/0.132: Move VkQueryControlFlags	2015-07-13 13:09:32 -07:00
Chad Versace	180c07ee50	vk/0.132: Move VkImageAspectFlags	2015-07-13 13:08:56 -07:00
Chad Versace	4b05a8cd31	vk/0.132: Move VkCmdBufferOptimizeFlags	2015-07-13 13:08:07 -07:00
Chad Versace	f1cf55fae6	vk/0.132: Move VkWaitEvent	2015-07-13 13:06:53 -07:00
Chad Versace	3112098776	vk/0.132: Move VkCmdBufferLevel	2015-07-13 13:06:33 -07:00
Chad Versace	c633ab5822	vk/0.132: Drop VK_ATTACHMENT_STORE_OP_RESOLVE_MSAA	2015-07-13 13:05:24 -07:00
Chad Versace	8f3b2187e1	vk/0.132: Rename bool32_t -> VkBool32 sed -i 's/bool32_t/VkBool32/g' \ $(git ls-files src/vulkan include/vulkan)	2015-07-13 13:03:36 -07:00
Chad Versace	77dcfe3c70	vk/0.132: Remove stray typedef	2015-07-13 12:58:17 -07:00
Chad Versace	601d0891a6	vk/0.132: Move VkImageUsageFlags	2015-07-13 12:48:44 -07:00
Chad Versace	829810fa27	vk/0.132: Move VkImageType and VkImageTiling	2015-07-13 11:49:56 -07:00
Chad Versace	17c8232ecf	vk/0.132: Import the 0.132 header Import it as vulkan-0.132.h.	2015-07-13 11:47:12 -07:00
Chad Versace	a158ff55f0	vk/vulkan.h: Remove headers for old API versions Remove the temporary headers for 0.90 and 0.130.	2015-07-13 11:46:30 -07:00
Chad Versace	1c4238a8e5	vk/0.130: Bump header version to 0.130 All APIs have been updated. This eliminates the diff between the work-in-progress header and the 0.130 header.	2015-07-10 20:06:09 -07:00
Chad Versace	f43a304dc6	vk/0.130: Update vkAllocMemory to use VkMemoryType	2015-07-10 17:35:52 -07:00
Chad Versace	df2a013881	vk/0.130: Implement vkGetPhysicalDeviceMemoryProperties()	2015-07-10 17:35:52 -07:00
Chad Versace	c7f512721c	vk/gem: Change signature of anv_gem_get_aperture() Replace the anv_device parameter with anv_physical_device, because this needs querying before vkCreateDevice.	2015-07-10 17:35:52 -07:00
Chad Versace	8cda3e9b1b	vk/device: Add member anv_physical_device::fd During anv_physical_device_init(), we opened the DRM device to do some queries, then promptly closed it. Now we keep it open for the lifetime of the anv_physical_device so that we can query it some more during vkGetPhysicalDevice*Properties() [which will happen in follow-up commits].	2015-07-10 17:35:52 -07:00
Chad Versace	4422bd4cf6	vk/device: Add func anv_physical_device_finish() Because in a follow-up patch I need to do some non-trivial teardown on anv_physical_device. For now, however, anv_physical_device_finish() is a no-op that's just called in the right place. Also, rename function fill_physical_device -> anv_physical_device_init for symmetry.	2015-07-10 17:35:52 -07:00
Jason Ekstrand	7552e026da	vk/device: Add an explicit destructor for RenderPass	2015-07-10 12:33:04 -07:00
Jason Ekstrand	8b342b39a3	vk/image: Add an explicit DestroyImage function	2015-07-10 12:30:58 -07:00
Jason Ekstrand	b94b8dfad5	vk/image: Add explicit constructors for buffer/image view types	2015-07-10 12:26:31 -07:00
Jason Ekstrand	18340883e3	nir: Add C++ versions of NIR_(SRC|DEST)_INIT	2015-07-10 11:57:33 -07:00
Chad Versace	9e64a2a8e4	mesa: Fix generation of git_sha1.h.tmp for gitlinks Don't assume that $(top_srcdir)/.git is a directory. It may be a gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked worktree [2]. [1] A "gitlink" is a text file that specifies the real location of the gitdir. [2] Linked worktrees are a new feature in Git 2.5. Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (cherry picked from commit `75784243df`)	2015-07-10 11:24:25 -07:00
Jason Ekstrand	19f0a9b582	vk/query.c: Use the casting functions	2015-07-09 20:32:44 -07:00
Jason Ekstrand	6eb221c884	vk/pipeline.c: Use the casting functions	2015-07-09 20:28:08 -07:00
Jason Ekstrand	fb4e2195ec	vk/formats.c: Use the casting functions	2015-07-09 20:24:17 -07:00
Jason Ekstrand	a52e208203	vk/image.c: Use the casting functions	2015-07-09 20:24:07 -07:00
Jason Ekstrand	b1de1d4f6e	vk/device.c: One more use of a casting function	2015-07-09 20:23:46 -07:00
Jason Ekstrand	8739e8fbe2	vk/meta.c: Use the casting functions	2015-07-09 20:16:13 -07:00
Jason Ekstrand	92556c77f4	vk: Fix the build	2015-07-09 18:59:08 -07:00
Jason Ekstrand	098209eedf	device.c: Use the cast helpers a bunch of places	2015-07-09 18:49:43 -07:00
Jason Ekstrand	73f9187e33	device.c: Use the cast helpers	2015-07-09 18:41:27 -07:00
Jason Ekstrand	7d24fab4ef	vk/private.h: Add a bunch of static inline casting functions We will need these as soon as we turn on type safety. We might as well define and start using them now rather than later.	2015-07-09 18:40:54 -07:00
Jason Ekstrand	5c49730164	vk/device.c: Fix whitespace issues	2015-07-09 18:20:28 -07:00
Jason Ekstrand	c95f9b61f2	vk/device.c: Use ANV_FROM_HANDLE a bunch of places	2015-07-09 18:20:10 -07:00
Jason Ekstrand	335e88c8ee	vk/vulkan.h: Add the pEnabledFeatures field to DeviceCreateInfo	2015-07-09 16:21:31 -07:00
Jason Ekstrand	34871cf7f3	vk/vulkan.h: Change the MsCreateInfo structure to the 130 version We do nothing with it at the moment, so this is a no-op.	2015-07-09 16:19:54 -07:00
Jason Ekstrand	8c2c37fae7	vk: Remove the old GetPhysicalDeviceInfo call	2015-07-09 16:14:37 -07:00
Jason Ekstrand	1f907011a3	vk: Add the new PhysicalDeviceQueue queries	2015-07-09 16:14:37 -07:00
Jason Ekstrand	977a469bce	vk: Support GetPhysicalDeviceProperties	2015-07-09 16:14:37 -07:00
Jason Ekstrand	65e0b304b6	vk: Add support for GetPhysicalDeviceLimits	2015-07-09 16:14:37 -07:00
Jason Ekstrand	f6d51f3fd3	vk: Add GetPhysicalDeviceFeatures	2015-07-09 16:14:37 -07:00
Chad Versace	5b75dffd04	vk/device: Fix vkEnumeratePhysicalDevices() The Vulkan spec says that pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL; otherwise it's an inout parameter. Mesa incorrectly treated it unconditionally as an inout parameter, which could have led to reading uninitialized data.	2015-07-09 15:53:21 -07:00
Chad Versace	fa915b661d	vk/device: Move device enumeration to vkEnumeratePhysicalDevices() Don't enumerate devices in vkCreateInstance(). That's where global, device-independent initialization should happen. Move device enumeration to the more logical location, vkEnumeratePhysicalDevices().	2015-07-09 15:41:17 -07:00
Chad Versace	c34d314db3	vk/device: Be consistent about path to DRM device Function fill_physical_device() has a 'path' parameter, and struct anv_physical_device has a 'path' member. Sometimes these are used; sometimes hardcoded "/dev/dri/renderD128" is used instead. Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location, during initialization of the physical device.	2015-07-09 15:27:26 -07:00
Connor Abbott	cff06bbe7d	vk/compiler: create an empty parameters list Prevents problems when initializing the sanity_param_count.	2015-07-09 14:29:23 -04:00
Connor Abbott	3318a86d12	nir/spirv: fix wrong writemask for ALU operations	2015-07-09 14:28:39 -04:00
Connor Abbott	b8fedc19f5	nir/spirv: fix memory context for builtin variable Fixes valgrind errors with func.depthstencil.basic.	2015-07-08 22:03:30 -04:00
Connor Abbott	e4292ac039	nir/spirv: zero out value array Before values are pushed or annotated with a name, decoration, etc., they need to have an invalid type, NULL name, NULL decoration, etc. ralloc zeroes everything by accident, so this wasn't an issue in practice, but we should be explicitly zeroing it.	2015-07-08 22:03:30 -04:00
Connor Abbott	997831868f	vk/compiler: create the right kind of program struct This fixes Valgrind errors and gets all the tests to pass with --use-spir-v.	2015-07-08 22:03:30 -04:00
Connor Abbott	a841e2c747	vk/compiler: mark inputs/outputs as read/written This doesn't handle inputs and outputs larger than a vec4, but we plan to add a varying splitting/packing pass to handle those anyways.	2015-07-08 22:03:30 -04:00
Jason Ekstrand	8640dc12dc	vk/vulkan.h: Copy the VkStructureType enum from version 130 We now have the exact same structs which require pType.	2015-07-08 17:45:52 -07:00
Jason Ekstrand	5a4ebf6bc1	vk: Move to the new pipeline creation API's	2015-07-08 17:30:18 -07:00
Chad Versace	4fcb32a17d	vk/0.130: Remove VkImageViewCreateInfo::minLod It's now set solely through VkSampler.	2015-07-08 14:48:22 -07:00
Jason Ekstrand	367b9ba78f	vk/vulkan.h: Move renderPassContinue from GraphicsBeginInfo to BeginInfo	2015-07-08 14:37:30 -07:00
Jason Ekstrand	d29ec8fa36	vk/vulkan.h: Update to the new UpdateDescriptorSets api	2015-07-08 14:24:56 -07:00
Jason Ekstrand	c8577b5f52	vk: Add a macro for creating anv variables from vulkan handles This is very helpful for doing the mass bunch of casts at the top of a function. It will also be invaluable when we get type safety in the API.	2015-07-08 14:24:14 -07:00
Chad Versace	ccb27a002c	vk/0.130 Update VkObjectType values Don't import any new enum tokens from the 0.130 header. Just update the values of existing enums. This reduces the diff by about 16 lines.	2015-07-08 12:53:49 -07:00
Chad Versace	8985dd15a1	vk/0.130: Remove VkDescriptorUpdateMode Nowhere used.	2015-07-08 12:51:46 -07:00
Chad Versace	e02dfa309a	vk/0.130: Remove VK_DEVICE_CREATE_MULTI_DEVICE_IQ_MATCH_BIT	2015-07-08 12:49:48 -07:00
Chad Versace	e9034ed875	vk/0.130: Update vkCmdBlitImage signature Add VkTexFilter param. Ignored for now.	2015-07-08 12:47:48 -07:00
Jason Ekstrand	aae45ab583	vk/vulkan.h: Add packing parameters to BufferImageCopy	2015-07-08 11:51:34 -07:00
Chad Versace	b4ef7f354b	vk/0.130: Remove msaa members of VkDepthStencilViewCreateInfo	2015-07-08 11:50:51 -07:00
Jason Ekstrand	522ab835d6	vk/vulkan.h: Move over to the new border color enums	2015-07-08 11:44:52 -07:00
Jason Ekstrand	7598329774	vk/vulkan.h: Move VkFormatProperties	2015-07-08 11:16:45 -07:00
Jason Ekstrand	52940e8fcf	vk/vulkan.h: Add RenderPassBeginContents	2015-07-08 10:57:13 -07:00
Jason Ekstrand	e19d6be2a9	vk/vulkan.h: Add command buffer levels	2015-07-08 10:53:32 -07:00
Jason Ekstrand	c84f2d3b8c	vk/vulkan.h: Import the VkPipeEvent enum from 130 Now, VkPipeEventFlags is back in sync with VkPipeEvent	2015-07-08 10:49:46 -07:00
Jason Ekstrand	b20cc72603	vk/vulkan.h: Remove VkFormatInfoType	2015-07-08 10:39:31 -07:00
Jason Ekstrand	8e05bbeee9	vk/vulkan.h: Update extension handling to rev 130	2015-07-08 10:38:07 -07:00
Jason Ekstrand	cc29a5f4be	vk/vulkan.h: Move format querying to the physical device	2015-07-08 09:34:47 -07:00
Jason Ekstrand	719fa8ac74	vk/vulkan.h: Remove some peer opening structs and STRUCTURE_TYPE enums	2015-07-08 09:25:13 -07:00
Jason Ekstrand	fc6dcc6227	vk: Add a copy of the v90 header.	2015-07-08 09:23:29 -07:00
Jason Ekstrand	12119282e6	vk/vulkan.h: Remove an unneeded comment	2015-07-08 09:18:09 -07:00
Jason Ekstrand	3c65a1ac14	vk/vulkan.h: Remove the MemoryRange stubs and add sparse stubs	2015-07-08 09:16:48 -07:00
Jason Ekstrand	bb6567f5d1	vk/vulkan.h: Switch BindObjectMemory to a device function and remove the index	2015-07-08 09:04:16 -07:00
Jason Ekstrand	e7acdda184	vk/vulkan.h: Switch to the split ProcAddr functions in 130	2015-07-07 18:51:53 -07:00
Jason Ekstrand	db24afee2f	vk/vulkan.h: Switch from GetImageSubresourceInfo to GetImageSubresourceLayout	2015-07-07 18:20:18 -07:00
Jason Ekstrand	ef8980e256	vk/vulkan.h: Switch from GetObjectInfo to GetMemoryRequirements	2015-07-07 18:16:42 -07:00
Jason Ekstrand	d9c2caea6a	vk: Update memory flushing functions to 130 This involves updating the prototype for FlushMappedMemory, adding InvalidateMappedMemoryRanges, and removing PinSystemMemory.	2015-07-07 17:22:31 -07:00
Jason Ekstrand	d5349b1b18	vk/vulkan.h: Constify the pFences parameter to ResetFences	2015-07-07 17:18:00 -07:00
Jason Ekstrand	6aa1b89457	vk/vulkan.h: Move the definitions of Create(Framebuffer|RenderPass) This better matches the 130 header.	2015-07-07 17:13:10 -07:00
Jason Ekstrand	0ff06540ae	vk: Implement the GetRenderAreaGranularity function At the moment, we're just going to scissor clears so a granularity of 1x1 is all we need.	2015-07-07 17:11:37 -07:00
Jason Ekstrand	435b062b26	vk/vulkan.h: Add a PipelineLayout parameter to BindDescriptorSets	2015-07-07 17:06:10 -07:00
Jason Ekstrand	518ca9e254	vk/vulkan.h: Add a compareEnable parameter to SamplerCreateInfo Our hardware doesn't actually need this, so adding it is a no-op.	2015-07-07 16:49:04 -07:00
Jason Ekstrand	672590710b	vk/vulkan.h: Remove initialCount from SemaphoreCreateInfo	2015-07-07 16:42:42 -07:00
Jason Ekstrand	80046a7d54	vk/vulkan.h: Update clear color handling to 130	2015-07-07 16:37:43 -07:00
Jason Ekstrand	3e4b00d283	meta: Use the VkClearColorValue structure for the color attribute	2015-07-07 16:27:06 -07:00
Jason Ekstrand	a35fef1ab2	vk/vulkan.h: Remove the pass argument from EndRenderPass	2015-07-07 16:22:23 -07:00
Jason Ekstrand	d2ca7e24b4	vk/vulkan.h: Rename VertexInputStateInfo to VertexInputStateCreateInfo	2015-07-07 16:15:55 -07:00
Jason Ekstrand	abbb776bbe	vk/vulkan.h: Remove programPointSize Instead, we auto-detect whether or not your shader writes gl_PointSize. If it does, we use 1.0, otherwise we take it from the shader.	2015-07-07 16:00:46 -07:00
Chad Versace	e7ddfe03ab	vk/0.130: Stub vkCmdClear*Attachment() funcs vkCmdClearColorAttachment vkCmdClearDepthStencilAttachment	2015-07-07 15:57:37 -07:00
Chad Versace	f89e2e6304	vk/0.130: Define enum VkImageAspectFlagBits	2015-07-07 15:57:37 -07:00
Chad Versace	55ab1737d3	vk/0.130: Define VkRect3D	2015-07-07 15:55:53 -07:00
Chad Versace	11901a9100	vk/0.130: Update name of vkCmdClearDepthStencilImage()	2015-07-07 15:53:35 -07:00
Chad Versace	dff32238c7	vk/0.130: Stub vkCmdExecuteCommands()	2015-07-07 15:51:55 -07:00
Chad Versace	85c0d69be9	vk/0.130: Update vkCmdWaitEvents() signature	2015-07-07 15:49:57 -07:00
Chad Versace	0ecb789b71	vk: Remove unused 'v' param from stub() macro	2015-07-07 15:47:24 -07:00
Chad Versace	f78d684772	vk: Stub vkCmdPushConstants() from 0.130 header	2015-07-07 15:46:19 -07:00
Chad Versace	18ee32ef9d	vk: Update vkCmdPipelineBarrier to 0.130 header	2015-07-07 15:43:41 -07:00
Chad Versace	4af79ab076	vk: Add func anv_clear_mask() A little helper func for inspecting and clearing bitmasks.	2015-07-07 15:43:41 -07:00
Jason Ekstrand	788a8352b9	vk/vulkan.h: Remove some unused fields. In particular, the following are removed: - disableVertexReuse - clipOrigin - depthMode - pointOrigin - provokingVertex	2015-07-07 15:33:00 -07:00
Jason Ekstrand	7fbed521bb	vk/vulkan.h: Remove the explicit primitive restart index Unfortunately, this requires some non-trivial changes to the driver. Now that the primitive restart index isn't given explicitly by the client, we always use ~0 for everything like D3D does. Unfortunately, our hardware is awesome and a 32-bit version of ~0 doesn't match any 16-bit values. This means, we have to set it to either UINT16_MAX or UINT32_MAX depending on the size of the index type. Since we get the index type from CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need to lazy-emit the VF packet.	2015-07-07 15:33:00 -07:00
Chad Versace	d6b840beff	vk: Delete some comments not present in 0.130 header Deleting the comments reduces diff noise.	2015-07-07 15:16:13 -07:00
Chad Versace	84a5bc25e3	vk: Pull in remaining 0.130 handle types This pulls in the definition of VkShaderModule and VkPipelineCache, which are nowhere used yet.	2015-07-07 15:13:01 -07:00
Chad Versace	f2899b1af2	vk: Pull in #defines from 0.130 header Despite not being used yet, pulling in the macros does diminish the header diff.	2015-07-07 15:11:30 -07:00
Jason Ekstrand	962d6932fa	vk/vulkan.h: Rename (min|max)Depth to (min|max)DepthBounds	2015-07-07 12:37:54 -07:00
Jason Ekstrand	1fb859e4b2	vk/vulkan.h: Remove client-settable pointSize from DynamicRsState	2015-07-07 12:35:32 -07:00
Jason Ekstrand	245583075c	vk/vulkan.h: Remove UINT8 index buffers	2015-07-07 11:26:49 -07:00
Jason Ekstrand	0a42332904	vk/vulkan.h: Re-order the object declarations	2015-07-07 11:26:49 -07:00
Kristian Høgsberg Kristensen	a1eea996d4	vk: Emit 3DSTATE_SAMPLE_MASK This was missing and was causing the driver to not work with execlists. Presumably we get a different initial hw context with execlists enabled, that has sample mask 0 initially. Set this to 0xffff for now. When we add MS support, we need to take the value from VkPipelineMsStateCreateInfo::sampleMask.	2015-07-06 23:54:12 -07:00
Kristian Høgsberg Kristensen	c325bb24b5	vk: Pull in new generated headers The new headers use stdbool for enable/disable fields which implicitly converts expressions like (flags & 8) to 0 or 1. Also handles MBO (must-be-one) fields by setting them to one, corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and makes a few enum values less clashy.	2015-07-06 22:12:26 -07:00
Chad Versace	23075bccb3	vk/image: Validate vkCreateImageView more Exhaustively validate the function input. If it's not validated and doesn't have an anv_finishme(), then I overlooked it.	2015-07-06 18:28:26 -07:00
Chad Versace	69e11adecc	vk/image: Add more info to VkImageViewType table Convert the table from the direct mapping VkImageViewType -> SurfaceType into a mapping to an info struct VkImageViewType -> struct anv_image_view_info	2015-07-06 18:28:26 -07:00
Chad Versace	b844f542e0	vk: Update VkImageViewType to 0.130.0 This splits 1D and 1D_ARRAY, 2D and 2D_ARRAY, CUBE and CUBE_ARRAY. The new tokens are unused. This is just a header update.	2015-07-06 18:28:26 -07:00
Chad Versace	5b04db71ff	vk/image: Move validation for vkCreateImageView Move the validation from anv_CreateImageView() and anv_image_view_init() to anv_validate_CreateImageView(). No new validation is added.	2015-07-06 18:27:14 -07:00
Jason Ekstrand	1f1b26bceb	vk/vulkan.h: Rename VkRect to VkRect2D	2015-07-06 17:47:18 -07:00
Jason Ekstrand	63c1190e47	vk/vulkan.h: Rename count to arraySize in VkDescriptorSetLayoutBinding	2015-07-06 17:43:58 -07:00
Jason Ekstrand	d84f3155b1	vk/vulkan.h: Remove the Vk(Memory|Semaphor|Image)OpenInfo structs We already deleted the functions that need them. The structs are just dangling uselessly.	2015-07-06 17:37:13 -07:00
Jason Ekstrand	65f9ccb4e7	vk/vulkan.h: Remove VK_MEMORY_PROPERTY_PREFER_HOST_LOCAL_BIT We weren't doing anything with it, so this is a no-op	2015-07-06 17:33:45 -07:00
Jason Ekstrand	68fa750f2e	vk/vulkan.h: Replace DEVICE_COHERENT_BIT with DEVICE_NON_COHERENT_BIT	2015-07-06 17:32:28 -07:00
Jason Ekstrand	d5b5bd67f6	vk/vulkan.h: Use the query result bits from revision 130 None of the important bits or names actually changed. It just added/removed some no-op names. No functional change.	2015-07-06 17:27:11 -07:00
Jason Ekstrand	d843418c2e	vk/vulkan.h: One more quick enum refactor clean-up	2015-07-06 17:26:29 -07:00
Jason Ekstrand	2b37fc28d1	vk/vulkan.h: Get rid of VERTEX_INPUT_STEP_RATE_DRAW We never supported it, so no functional change.	2015-07-06 17:24:26 -07:00
Jason Ekstrand	a75967b1bb	vk/vulkan.h: Remove the CLEAR_OPTIMAL image layout	2015-07-06 17:21:19 -07:00
Jason Ekstrand	2b404e5d00	vk: Rename CPU_READ/WRITE_BIT to HOST_READ/WRITE_BIT	2015-07-06 17:18:25 -07:00
Jason Ekstrand	c57ca3f16f	vk/vulkan.h: Remove VK_IMAGE_CREATE_CLONEABLE_BIT	2015-07-06 17:14:30 -07:00
Jason Ekstrand	2de388c49c	vk: Remove SHAREABLE bits They were removed from the Vulkan API and we don't really use them because there are no multi-GPU i965 systems.	2015-07-06 17:12:51 -07:00
Jason Ekstrand	1b0c47bba6	vk/vulkan.h: Re-order the logic op enums	2015-07-06 17:08:11 -07:00
Jason Ekstrand	c7cef662d0	vk/vulkan.h: Reformat a bunch of enums to match revision 130 In theory, no functional change.	2015-07-06 17:06:02 -07:00
Jason Ekstrand	8c5e48f307	vk: Rename NUM_SHADER_STAGE to SHADER_STAGE_NUM This is a refactor of more than just the header but it lets us finish reformatting the shader stage enum.	2015-07-06 16:43:28 -07:00
Jason Ekstrand	d9176f2ec7	vk: Reformat a bunch of enums This accounts for a number of differences between the generated headers and the hand-written header. Not all reformatting is done in this commit but it does make the headers much more diffable. In theory, no functional change.	2015-07-06 16:41:31 -07:00
Jason Ekstrand	e95bf93e5a	vk: Pull the VkResult enum from revision 130	2015-07-06 16:15:12 -07:00
Jason Ekstrand	1b7b580756	vk: re-arrange enums to match the order in revision 130	2015-07-06 16:11:05 -07:00
Jason Ekstrand	2fb524b369	vk: Rename a parameter in CmdBindDynamicStateObject	2015-07-06 15:37:17 -07:00
Jason Ekstrand	c5ffcc9958	vk: Remove multi-device stuff	2015-07-06 15:34:55 -07:00
Jason Ekstrand	c5ab5925df	vk: Remove ClearDescriptorSets	2015-07-06 15:32:40 -07:00
Jason Ekstrand	ea5fbe1957	vk: Remove begin/end descriptor pool update	2015-07-06 15:32:27 -07:00
Jason Ekstrand	9a798fa946	vk: Remove stub for CloneImageData	2015-07-06 15:30:05 -07:00
Jason Ekstrand	78a0d23d4e	vk: Remove the stub support for memory priorities	2015-07-06 15:28:10 -07:00
Jason Ekstrand	11cf214578	vk: Remove the stub support for explicit memory references	2015-07-06 15:27:58 -07:00
Jason Ekstrand	0dc7d4ac8a	vk/vulkan.h: Reformat structs to match revision 130 Structs in the old version were specified as typedef struct VkSomeThing_ { type field; // comment } VkSomeThing; However, in the generated headers, you have typedef struct { type field; } VkSomeThing; This commit also removes some unneeded whitespaces.	2015-07-06 15:19:12 -07:00
Jason Ekstrand	19aabb5730	vk/vulkan.h: Re-arrange structures to match the order in 130	2015-07-06 15:09:30 -07:00
Connor Abbott	f9dbc34a18	nir/spirv: fix some bugs	2015-07-06 15:00:37 -07:00
Connor Abbott	f3ea3b6e58	nir/spirv: add support for builtins inside structures We may be able to revert this depending on the outcome of bug 14190, but for now it gets vertex shaders working with SPIR-V.	2015-07-06 15:00:37 -07:00
Connor Abbott	15047514c9	nir/spirv: fix a bug with structure creation We were creating 2 extra bogus fields.	2015-07-06 15:00:37 -07:00
Connor Abbott	73351c6a18	nir/spirv: fix a bad assertion in the decoration handling We should be asserting that the parent decoration didn't hand us a member if the child decoration did, but different child decorations may obviously have different members.	2015-07-06 15:00:37 -07:00
Connor Abbott	70d2336e7e	nir/spirv: pull out logic for getting builtin locations Also add support for more builtins.	2015-07-06 15:00:37 -07:00
Connor Abbott	aca5fc6af1	nir/spirv: plumb through the type of dereferences We need this to know if a deref is of a builtin.	2015-07-06 15:00:37 -07:00
Connor Abbott	66375e2852	nir/spirv: handle structure member builtin decorations	2015-07-06 15:00:37 -07:00
Connor Abbott	23c179be75	nir/spirv: add a vtn_type struct This will handle decorations that aren't in the glsl_type.	2015-07-06 15:00:37 -07:00
Connor Abbott	f9bb95ad4a	nir/spirv: move 'type' into the union Since SSA values now have their own types, it's more convenient to make 'type' only used when we want to look up an actual SPIR-V type, since we're going to change its type soon to support various decorations that are handled at the SPIR-V -> NIR level.	2015-07-06 15:00:37 -07:00
Jason Ekstrand	d5dccc1e7a	vk: Move CreateFramebuffer and CreateRenderPass higher in the header This matches where they are in the 130 header.	2015-07-06 14:41:43 -07:00
Jason Ekstrand	4a42f45514	vk: Remove atomic counters stubs	2015-07-06 14:38:45 -07:00
Jason Ekstrand	630b19a1c8	vk: Make vulkan.h look more like vulkan-130.h Most of these changes are insubstantial. The only potentially substantial change is that we added a few new #defines for API maximums.	2015-07-06 14:32:52 -07:00
Jason Ekstrand	2f9180b1b2	vk: Add a revision 130 header along-side the current header	2015-07-06 14:16:51 -07:00
Jason Ekstrand	1f1465f077	vk/meta: Add an initial implementation of ClearColorImage	2015-07-02 18:15:06 -07:00
Jason Ekstrand	8a6c8177e0	vk/meta: Factor the guts out of cmd_buffer_clear	2015-07-02 18:13:59 -07:00
Jason Ekstrand	beb0e25327	vk: Roll back to API v90 This is what version 0.1 of the Vulkan SDK is built against.	2015-07-01 16:44:12 -07:00
Jason Ekstrand	fa663c27f5	nir/spirv: Add initial structure member decoration support	2015-07-01 15:38:26 -07:00
Jason Ekstrand	e3d60d479b	nir/spirv: Make vtn_handle_type match the other handler functions Previously, the caller of vtn_handle_type had to handle actually inserting the type. However, this didn't really work if the type was decorated in any way.	2015-07-01 15:34:10 -07:00
Jason Ekstrand	7a749aa4ba	nir/spirv: Add basic support for Op[Group]MemberDecorate	2015-07-01 14:18:07 -07:00
Jason Ekstrand	682eb9489d	vk/x11: Allow for the client querying the size of the format properties	2015-07-01 14:18:07 -07:00
Chad Versace	bba767a9af	vk/formats: Fix entry for S8_UINT I forgot to update this when fixing the depth formats.	2015-06-30 09:41:44 -07:00
Chad Versace	6720b47717	vk/formats: Document new meaning of anv_format::cpp The way the code currently works is that anv_format::cpp is the cpp of anv_format::surface_format. Me and Kristian disagree about how the code should work. Despite that, I think it's in our discussion's best interest to document how the code currently works. That should eliminate confusion. If and when the code begins to work differently, then we'll update the anv_format comments.	2015-06-30 09:41:41 -07:00
Chad Versace	709fa463ec	vk/depth: Add a FIXME 3DSTATE_DEPTH_BUFFER.Width,Height are wrong.	2015-06-26 22:15:03 -07:00
Chad Versace	5b3a1ceb83	vk/image: Enable 2d single-sample color miptrees What's been tested, for both image views and color attachment views: - VK_FORMAT_R8G8B8A8_UNORM - VK_IMAGE_VIEW_TYPE_2D - mipLevels: 1, 2 - baseMipLevel: 0, 1 - arraySize: 1, 2 - baseArraySlice: 0, 1 What's known to be broken: - Depth and stencil miptrees. To fix this, anv_depth_stencil_view needs major rework. - VkImageViewType != 2D - MSAA Fixes Crucible tests: func.miptree.view-2d.levels02.array01.* func.miptree.view-2d.levels01.array02.* func.miptree.view-2d.levels02.array02.*	2015-06-26 22:11:15 -07:00
Chad Versace	c6e76aed9d	vk/image: Define anv_surface, refactor anv_image This prepares for upcoming miptree support. anv_surface is a proxy for color surfaces, depth surfaces, and stencil surfaces. Embed two instances of anv_surface into anv_image: the primary surface (color or depth), and an optional stencil surface.	2015-06-26 21:45:53 -07:00
Chad Versace	127cb3f6c5	vk/image: Reformat function signatures Reformat them to match Mesa code-style.	2015-06-26 20:12:42 -07:00
Chad Versace	fdcd71f71d	vk/image: Embed VkImageCreateInfo* into anv_image_create_info All function signatures that matched this pattern, old: f(const VkImageCreateInfo *, const struct anv_image_create_info *) were rewritten as new: f(const struct anv_image_create_info *)	2015-06-26 20:06:08 -07:00
Chad Versace	ca6cef3302	vk/image: Drop some tmp vars in anv_image_view_init() Variables 'tile_mode' and 'format' are unneeded.	2015-06-26 19:50:04 -07:00
Chad Versace	9c46ba9ca2	vk/image: Abort on stencil image views The code doesn't work. Not even close. Replace the broken code with a FINISHME and abort.	2015-06-26 19:23:21 -07:00
Chad Versace	667529fbaa	vk: Reindent struct anv_image	2015-06-26 15:27:20 -07:00
Chad Versace	74e3eb304f	vk: Define MIN(a, b) macro	2015-06-26 15:09:07 -07:00
Chad Versace	55752fe94a	vk: Rename functions ALIGN_32 -> align_32 ALIGN_U32 and ALIGN_I32 are functions, not macros. So stop using allcaps.	2015-06-26 15:07:59 -07:00
Connor Abbott	6ee082718f	Merge branch 'wip/nir-vtn' into vulkan Adds composites and matrix multiplication, plus some control flow fixes.	2015-06-26 12:14:05 -07:00
Chad Versace	37d6e04ba1	vk/formats: Remove the cpp=0 stencil hack The format table defined cpp = 0 for stencil-only formats. The real cpp is 1. When code begins to lie, especially about stencil buffers, code becomes increasingly fragile as time progresses, and the damage becomes increasingly hard to undo. (For precedent, see the painful history of stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver). Let's undo the stencil buffer cpp lie now to avoid future pain. In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for cpp == 0; and delete all comments about the hack.	2015-06-26 09:58:22 -07:00
Chad Versace	67a7659d69	vk/image: Refactor anv_image_create() From my experience with intel_mipmap_tree.c, I learned that for structs like anv_image and intel_mipmap_tree, which have sprawling multi-function construction codepaths, it's easy to mistakenly use uninitialized struct members during construction. Let's eliminate the risk of using uninitialized anv_image members during construction. Fill the struct at the function bottom instead of piecemeal throughout the constructor.	2015-06-26 09:32:59 -07:00
Chad Versace	5d7103ee15	vk/image: Group some assertions closer together In anv_image_create(), group together the assertions on VkImageCreateInfo.	2015-06-26 09:05:46 -07:00
Chad Versace	0349e8d607	vk/formats: #undef fmt at end of format table	2015-06-26 07:38:02 -07:00
Chad Versace	068b8a41e2	vk: Fix comment for anv_depth_stencil_view::stencil_qpitch s/DEPTH/STENCIL/	2015-06-26 07:31:57 -07:00
Chad Versace	7ea707a42a	vk/image: Add qpitch fields to anv_depth_stencil_view For now, hard-code them to 0.	2015-06-25 20:10:16 -07:00
Chad Versace	b91a76de98	vk: Reindent and document struct anv_depth_stencil_view	2015-06-25 20:10:16 -07:00
Chad Versace	ebe1e768b8	vk/formats: Fix incorrect depth formats anv_format::surface_format was incorrect for Vulkan depth formats. For example, the format table mapped VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT but should have mapped VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT The Crucible test func.depthstencil.basic passed despite the bug, but only because it did not attempt to texture from the depth surface. The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and 3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them as enum spaces, the two enum spaces have incompatible collisions. Fix this by adding a new field 'depth_format' to struct anv_format. Refer to brw_surface_formats.c:translate_tex_format() for precedent.	2015-06-25 20:10:16 -07:00
Chad Versace	45b804a049	vk/image: Rename local variable in anv_image_create() This function has many local variables for info structs. Having one named simply 'info' is confusing. Rename it to 'format_info'.	2015-06-25 20:10:16 -07:00
Chad Versace	528071f004	vk/formats: Fix table entry for R8G8B8_SNORM Now that anv_formats[] is formatted like a table, buggy entries are easier to see.	2015-06-25 20:10:16 -07:00
Chad Versace	4c8146313f	vk/formats: Rename anv_format::format -> surface_format I misinterpreted anv_format::format as a VkFormat. Instead, it is a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename the field to 'surface_format' to make it unambiguous.	2015-06-25 20:10:16 -07:00
Chad Versace	4b8b451a1d	vk/formats: Rename anv_format::channels -> num_channels I misinterpreted anv_format::channels as a bitmask of channels. Renaming it to 'num_channels' makes it unambiguous.	2015-06-25 20:10:16 -07:00
Chad Versace	af0ade0d6c	vk: Reindent struct anv_format	2015-06-25 20:10:16 -07:00
Chad Versace	ae29fd1b55	vk/formats: Don't abbreviate tokens in the format table Abbreviating the VK_FORMAT_* tokens doesn't help much. To the contrary, it means grep and ctags can't find them.	2015-06-25 20:10:16 -07:00
Jason Ekstrand	d5e41a3a99	vk/compiler: Add the initial hacks to get SPIR-V up and going	2015-06-25 17:36:35 -07:00
Jason Ekstrand	c4c1d96a01	HACK: Get rid of sanity_param_count for FS	2015-06-25 17:36:34 -07:00
Jason Ekstrand	4f5ef945e0	i965: Don't print the GLSL IR if it doesn't exist	2015-06-25 17:36:34 -07:00
Jason Ekstrand	588acdb431	nir/spirv: Set the right location for shader input/outputs We need to add FRAG_RESULT_DATA0 etc. to the input/output location.	2015-06-25 17:36:34 -07:00
Jason Ekstrand	333b8ddd6b	nir/spirv: Set the interface type on uniform blocks	2015-06-25 17:36:34 -07:00
Jason Ekstrand	7e1792b1b7	nir/spirv: Set the system value mode on builtins	2015-06-25 17:36:34 -07:00
Jason Ekstrand	b72936fdad	nir/spirv: Actually put variables on the right linked list	2015-06-25 17:36:34 -07:00
Jason Ekstrand	ee0a8f23e4	glsl: Move vert_attrib varying_slot and frag_result enums to shader_enums.h	2015-06-25 17:36:34 -07:00
Chad Versace	fa352969a2	vk/image: Check extent does not exceed surface type limits	2015-06-25 16:53:24 -07:00
Chad Versace	99031aa0f3	vk/image: Stop hardcoding SurfaceType of VkImageView Instead, translate VkImageViewType to a gen SurfaceType.	2015-06-25 16:53:22 -07:00
Chad Versace	7ea121687c	vk/image: Add anv_image::surf_type This is the gen SurfaceType, such as SURFTYPE_2D.	2015-06-25 16:52:16 -07:00
Chad Versace	cb30acaced	vk/image: Add tables for gen SurfaceType Tables for mapping VkImageType and VkImageViewType to gen SurfaceType. Tables are unused.	2015-06-25 16:52:16 -07:00
Chad Versace	1132080d5d	vk/util: Add anv_loge() for logging error messages	2015-06-25 16:52:16 -07:00
Chad Versace	5f2d469e37	vk: Add func anv_is_aligned()	2015-06-25 16:52:16 -07:00
Chad Versace	f7fb7575ef	vk: Add anv_minify()	2015-06-25 16:52:05 -07:00
Chad Versace	7cec6c5dfd	vk: Define MAX(a, b) macro	2015-06-25 16:29:42 -07:00
Jason Ekstrand	d178e15567	nir/spirv: Fix up some deref ralloc parenting	2015-06-24 21:39:07 -07:00
Jason Ekstrand	845002e163	i965/nir: Handle returns as long as they're at the end of a function	2015-06-24 21:38:49 -07:00
Jason Ekstrand	2ecac045a4	i965/nir: Split NIR shader handling into two functions The brw_create_nir function takes a GLSL or ARB shader and turns it into a NIR shader. The guts of the optimization and lowering code is now split into a new brw_process_shader function.	2015-06-24 21:22:07 -07:00
Jason Ekstrand	e369a0eb41	nir/spirv: Use vtn_ssa_value for texture coordinates	2015-06-24 20:39:37 -07:00
Jason Ekstrand	d0bd2bc604	nir/spirv: Add support for the Uniform storage class This is kinda sketchy. I'm not really sure this is the way it's supposed to be used.	2015-06-24 20:32:05 -07:00
Jason Ekstrand	ba0d9d33d4	nir/spirv: Add support for some more decorations including built-in	2015-06-24 20:30:32 -07:00
Jason Ekstrand	1bc0a1ad98	nir/spirv: Make the header file C++ safe	2015-06-24 19:01:10 -07:00
Jason Ekstrand	88d02a1b27	vk: Build xmlconfig stuff into libi965_compiler	2015-06-24 15:59:09 -07:00
Kristian Høgsberg Kristensen	24dff4f8fa	vk/headers: Handle MBO fields These must be set to one.	2015-06-24 09:37:50 -07:00
Jason Ekstrand	a62edcce4e	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-06-23 18:05:25 -07:00
Connor Abbott	dee4a94e69	nir/vtn: add support for phi nodes	2015-06-23 10:34:55 -07:00
Connor Abbott	fe1269cf28	nir/builder: add support for inserting before/after blocks	2015-06-23 10:34:22 -07:00
Connor Abbott	9a3dda101e	nir/vtn: fix emitting code after loops When we're done emitting the code for a loop, we need to visit the new break block, which is the merge block of the current loop, rather than the old merge block, which is the merge block of the loop containing the one we just emitted code for.	2015-06-22 13:53:08 -07:00
Connor Abbott	e9c21d0ca0	unbreak things	2015-06-22 11:59:55 -07:00
Kristian Høgsberg Kristensen	9b9f973ca6	vk: Implement scratch buffers to make spilling work	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	9e59003fb1	vk: Undo relocs for scratch bos	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	b20794cfa8	vk/allocator: Get rid of non-memfd path We can just use modern valgrind now.	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	aba75d0546	vk/headers: Make General State offsets relocations	2015-06-19 15:42:15 -07:00
Connor Abbott	841aab6f50	matrices matrices matrices	2015-06-18 18:52:44 -07:00
Connor Abbott	d0fc04aacf	nir/types: be less strict about constructing matrix types	2015-06-18 18:51:51 -07:00
Connor Abbott	22854a60ef	nir/builder: add a nir_fdot() convenience function	2015-06-18 17:34:55 -07:00
Connor Abbott	0e86ab7c0a	nir/types: add a helper to transpose a matrix type	2015-06-18 17:34:12 -07:00
Connor Abbott	de4c31a085	fix glsl450 for composites	2015-06-18 17:33:08 -07:00
Kristian Høgsberg Kristensen	aedd3c9579	vk: Add missing gen7 RENDER_SURFACE_STATE struct	2015-06-17 21:42:29 -07:00
Connor Abbott	bf5a615659	composites composites composites	2015-06-17 16:25:38 -07:00
Kristian Høgsberg Kristensen	fa8a07748d	vk: Compute CS exec mask and thread width max in pipeline We compute the right mask and thread width max parameters as part of pipeline creation and set them accordingly at vkCmdDispatch() and vkCmdDispatchIndirect() time. These parameters depend only on the local group size and the dispatch width of the program so we can figure this out at pipeline create time.	2015-06-12 18:21:50 -07:00
Kristian Høgsberg Kristensen	c103c4990c	vk: Set binding table layout for CS We weren't setting the binding table layout for the backend compiler.	2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen	2fdd17d259	vk: Generate CS prog_data into the pipeline instance We were generating the prog_data into a local variable and never initializing the pipeline->cs_prog_data one.	2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen	00494c6cb7	vk: Document how depth/stencil formats work in anv_image_create() This reverts commits `e17ed04` * vk/image: Don't double-allocate stencil buffers `1ee2d1c` * vk/image: Teach anv_image_choose_tile_mode about WMAJOR and instead adds a comment to describe the subtlety of how we create images for stencil only formats.	2015-06-11 22:07:16 -07:00
Kristian Høgsberg Kristensen	fbc9fe3c92	vk: Use compute pipeline layout when binding compute sets	2015-06-11 21:57:43 -07:00
Kristian Høgsberg Kristensen	765175f5d1	vk: Implement basic compute shader support	2015-06-11 15:31:42 -07:00
Kristian Høgsberg Kristensen	7637b02aaa	vk: Emit PIPELINE_SELECT on demand	2015-06-11 15:21:49 -07:00
Kristian Høgsberg Kristensen	405697eb3d	vk: Stop asserting we have a fragment shader Even for graphics, this is not a requirement, we can have a depth-only output pipeline.	2015-06-11 15:07:38 -07:00
Kristian Høgsberg Kristensen	e7edde60ba	vk: Defer setting viewport dynamic state We can't emit this until we've done a 3D pipeline select.	2015-06-11 15:04:09 -07:00
Kristian Høgsberg Kristensen	f7fe06cf0a	vk: Disable shader stages in the graphics pipeline batch We need to move this into the graphics pipeline batch so we don't emit it for compute pipelines.	2015-06-11 14:58:31 -07:00
Kristian Høgsberg Kristensen	9aae480cc4	vk: Don't emit STATE_SIP We don't have a SIP kernel and don't enable exceptions.	2015-06-11 14:56:29 -07:00
Kristian Høgsberg Kristensen	923e923bbc	vk: Compile fragment shader after VS and GS Just moving code around to do shader stages in the natural order.	2015-06-11 14:55:50 -07:00
Jason Ekstrand	1dd63fcbed	vk/entrypoints: Don't print every single function call	2015-06-11 10:10:13 -07:00
Kristian Høgsberg Kristensen	b581e924b6	vk: Remove left-over trp call	2015-06-11 09:26:49 -07:00
Kristian Høgsberg Kristensen	d76ea7644a	vk: Set maximum point size range We set both minimum and maximum point size to 0 in 3DSTATE_CLIP, which will clip away all points.	2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen	a5b49d2799	vk: Use generated headers with fixed point support The generated headers now convert float in the template struct to the correct fixed point format.	2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen	ea7ef46cf9	vk: Regenerate headers with __gen_validate_value()	2015-06-11 09:25:03 -07:00
Jason Ekstrand	a566b1e08a	vk/formats: Refactor format properties code Along with the refactor, we now do the right thing when we hit an unsupported format: Set the flags to 0 and return VK_SUCCESS.	2015-06-11 09:11:16 -07:00
Jason Ekstrand	2a3c29698c	vk/image: Add a bunch of asserts	2015-06-10 21:04:51 -07:00
Jason Ekstrand	c8b62d109b	vk: Add a couple vk_error calls	2015-06-10 21:04:13 -07:00
Jason Ekstrand	7153b56abc	vk/private: Add a non-fatal assert	2015-06-10 21:03:50 -07:00
Jason Ekstrand	29d2bbb2b5	vk/cmd: Add an initial implementation of PipelineBarrier We may want to do something more intelligent here later such as actually handling image layout transitions. However, this should do for now.	2015-06-10 16:37:33 -07:00
Jason Ekstrand	047ed02723	vk/emit: Use valgrind to validate every packed field	2015-06-10 12:43:02 -07:00
Jason Ekstrand	9cae3d18ac	vk: Add valgrind checks in various emit functions The check in batch_bo_finish should catch any undefined values in the batch but isn't that great for debugging. The checks in the various emit functions will help get better granularity.	2015-06-09 21:51:37 -07:00
Jason Ekstrand	d5ad24e39b	vk: Move the valgrind include and VG() macro to private.h	2015-06-09 21:51:37 -07:00
Chad Versace	e17ed04b03	vk/image: Don't double-allocate stencil buffers If the main surface has format S8_UINT, then don't allocate the auxiliary stencil surface.	2015-06-09 16:39:28 -07:00
Chad Versace	1ee2d1c3fc	vk/image: Teach anv_image_choose_tile_mode about WMAJOR	2015-06-09 16:38:55 -07:00
Chad Versace	2d2e148952	vk/util: Add anv_abortf(), anv_abortfv() Convenience functions to print an error message then abort.	2015-06-09 16:38:50 -07:00
Chad Versace	ffb1ee5d20	vk: Define anv_noreturn macro	2015-06-09 16:38:46 -07:00
Chad Versace	f1db3b3869	vk/image: Factor tile mode selection into separate function Because it will eventually need to get smarter.	2015-06-09 16:38:42 -07:00
Jason Ekstrand	11e941900a	vk/device: Actually allow destruction	2015-06-09 16:28:46 -07:00
Jason Ekstrand	5d4b6a01af	vk/cmd_buffer: Properly initialize/reset dynamic states	2015-06-09 16:27:55 -07:00
Jason Ekstrand	634a6150b9	vk/pipeline: Zero out the depth-stencil state when not in use	2015-06-09 16:26:55 -07:00
Jason Ekstrand	919e7b7551	vk/device: Use anv_CreateDynamicViewportState instead of the vk one	2015-06-09 16:01:56 -07:00
Jason Ekstrand	0599d39dd9	vk/device: Dedent the vkCreateDynamicViewportState call	2015-06-09 15:53:26 -07:00
Chad Versace	d57c4cf999	vk/util: Annotate anv_finishme() as printflike	2015-06-09 14:46:49 -07:00
Chad Versace	822cb16abe	vk: Define anv_printflike() macro	2015-06-09 14:46:45 -07:00
Chad Versace	081f617b5a	vk/image: Stop hardcoding alignment of stencil surfaces Look up the alignment from anv_tile_info_table.	2015-06-09 14:16:56 -07:00
Chad Versace	e6bd568f36	vk/image: Rewrite tile info table - Reduce the number of table lookups in anv_image_create from 4 to 1. - Add field for surface alignment. - Shorten field names tile_width, tile_height -> width, height.	2015-06-09 14:16:45 -07:00
Chad Versace	5b777e2bcf	vk/image: Delete an old comment	2015-06-09 14:14:29 -07:00
Jason Ekstrand	d842a6965f	vk/compiler: Free the GL errors data	2015-06-09 12:36:23 -07:00
Jason Ekstrand	9f292219bf	vk/compiler: Free more of prog_data when tearing down a pipeline	2015-06-09 12:36:23 -07:00
Jason Ekstrand	66b00d5e5a	vk/queue: Embed the queue in and allocate it with the device	2015-06-09 12:36:23 -07:00
Jason Ekstrand	38f5eef59d	vk/device: Free border color states when we have valgrind	2015-06-09 12:36:23 -07:00
Jason Ekstrand	999b56c507	vk/device: Destroy all batch buffers Due to a copy+paste error, we were destroying all but the first batch or surface state buffer. Now we destroy them all.	2015-06-09 12:36:23 -07:00
Jason Ekstrand	3a38b0db5f	vk/meta: Clean up temporary objects	2015-06-09 12:36:23 -07:00
Jason Ekstrand	9d6f55dedf	vk/surface_view: Add a destructor	2015-06-09 12:36:23 -07:00
Chad Versace	e6162c2fef	vk/image: Add anv_image::h_align,v_align Use the new fields to compute RENDER_SURFACE_STATE.Surface*Alignment. We still hardcode them to 4, though.	2015-06-09 12:19:24 -07:00
Jason Ekstrand	58afc24e57	vk/allocator: Remove the concept of a slave block pool This reverts commit `d24f8245db`.	2015-06-08 17:46:32 -07:00
Jason Ekstrand	b6363c3f12	vk/device: Remove the binding table pools/streams	2015-06-08 17:45:57 -07:00
Jason Ekstrand	531549d9fc	vk/pipeline: Move freeing the program stream to pipeline.c It's created in pipeline.c so we should free it there.	2015-06-08 14:27:04 -07:00
Jason Ekstrand	66a4dab89a	vk/pipeline: Don't destroy the program stream It's freed in compiler.cpp and we don't want to free it twice.	2015-06-08 13:53:19 -07:00
Jason Ekstrand	920fb771d4	vk/allocator: Make the use of NULL_BLOCK in state_stream_finish explicit	2015-06-08 13:53:19 -07:00
Kristian Høgsberg Kristensen	52637c0996	vk: Quiet a few warnings	2015-06-08 08:51:40 -07:00
Kristian Høgsberg Kristensen	9eab70e54f	vk: Create a minimal context for the compiler This avoids the full brw context initialization and just sets up context constants, initializes extensions and sets a few driver vfuncs for the front-end GLSL compiler.	2015-06-08 08:51:40 -07:00
Jason Ekstrand	ce00233c13	vk/cmd_buffer: Use the dynamic state stream in emit_dynamic and merge_dynamic	2015-06-05 17:26:41 -07:00
Jason Ekstrand	e69588b764	vk/device: Use a 64-byte alignment for CC state	2015-06-05 17:26:26 -07:00
Jason Ekstrand	c2eeab305b	vk/pipeline: Actually free the program stream and dynamic pool	2015-06-05 17:26:26 -07:00
Jason Ekstrand	ed2ca020f8	vk/allocator: Avoid double-free in the bo pool	2015-06-05 17:12:28 -07:00
Jason Ekstrand	aa523d3c62	vk/gem: Call VALGRIND_FREELIKE_BLOCK before unmapping	2015-06-05 16:41:49 -07:00
Chad Versace	87d98e1935	vk: Fix 2 incorrect typecasts The compiler didn't find the cast errors because all Vulkan types are just integers.	2015-06-04 14:32:22 -07:00
Chad Versace	b981379bcf	vk: Make `make clean` remove generated spirv headers	2015-06-04 14:26:46 -07:00
Jason Ekstrand	8d930da35d	vk/allocator: Remove an unneeded VG() wrapper	2015-06-04 09:14:33 -07:00
Jason Ekstrand	7f90e56e42	vk/device: Disallow device destruction	2015-06-04 09:14:33 -07:00
Chad Versace	9cd42b3dea	vk: Fix build Commit 1286bd, which deleted vk.c, broke the build. Update the Makefile to fix it.	2015-06-04 09:01:30 -07:00
Jason Ekstrand	251aea80b0	vk/DS: Mask stencil masks to 8 bits	2015-06-03 16:59:13 -07:00
Connor Abbott	47bd462b0c	awesome control flow bugfixes/clarifications	2015-06-03 14:10:28 -04:00
Kristian Høgsberg Kristensen	a37d122e88	vk: Set color/blend state in meta clear if not set yet	2015-06-02 23:08:05 -07:00
Kristian Høgsberg Kristensen	1286bd3160	vk: Delete vk.c test case We now have crucible up and running and all vk sub-cases have been moved over. Delete this crufty old hack of a test case.	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	2f6aa424e9	vk: Update generated headers with support for 64 bit fields	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	5744d1763c	vk: Set cb_state to NULL at cmd buffer create time Dynamic color/blend state can be NULL in case we're not rendering to color targets (only output to depth and/or stencil). Initialize cmd_buffer->cb_state to NULL so we can reliably detect whether it's been set or not.	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	c8f078537e	vk: Implement vertexOffset parameter of vkCmdDrawIndexed() As exposed by the func.draw_indexed test, we were ignoring the argument and hardcoding 0.	2015-06-02 22:57:42 -07:00
Jason Ekstrand	e702197e3f	vk/formats: Add a name to the metadata and better logging	2015-06-02 11:30:39 -07:00
Jason Ekstrand	fbafc946c6	vk/formats: Rework the formats table	2015-06-02 11:30:39 -07:00
Kristian Høgsberg Kristensen	f98c89ef31	vk: Move query related functionality to new file query.c	2015-06-01 21:52:45 -07:00
Jason Ekstrand	08748e3a0c	i965: Use NIR by default for vertex shaders on GEN8+ GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell: total instructions in shared programs: 2742062 -> 2681339 (-2.21%) instructions in affected programs: 1514770 -> 1454047 (-4.01%) helped: 5813 HURT: 1120 The gained programs are ARB vertex programs that were previously going through the vec4 backend. Now that we have prog_to_nir, ARB vertex programs can go through the scalar backend so they show up as "gained" in the shader-db results. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-06-01 12:25:58 -07:00
Jason Ekstrand	d4cbf6a728	vk/compiler: Add an index_count to the bind map and check for OOB	2015-06-01 12:25:58 -07:00
Jason Ekstrand	510b5c3bed	vk/HACK: Plumb real descriptor set/index into textures	2015-06-01 12:25:58 -07:00
Jason Ekstrand	aded32bf04	NIR: Add a helper for doing sampler lowering for vulkan	2015-06-01 12:25:58 -07:00
Kristian Høgsberg Kristensen	5caa408579	vk: Indent tables to align '=' at column 48	2015-05-31 22:36:26 -07:00
Kristian Høgsberg Kristensen	76bb658518	vk: Add support for anisotropic bits	2015-05-31 22:15:34 -07:00
Kristian Høgsberg Kristensen	dc56e4f7b8	vk: Implement support for sampler border colors This supports the three Vulkan border color types for float color formats. The support for integer formats is a little trickier, as we don't know the format of the texture at this time.	2015-05-31 17:20:48 -07:00
Jason Ekstrand	e497ac2c62	vk/device: Only flush the texture cache when setting state base address After further examination, it appears that the other flushes and stalls weren't actually needed.	2015-05-30 18:04:50 -07:00
Jason Ekstrand	2251305e1a	vk/cmd_buffer: Track descriptor set dirtying per-stage	2015-05-30 10:07:29 -07:00
Jason Ekstrand	33cccbbb73	vk/device: Emit PIPE_CONTROL flushes surrounding new STATE_BASE_ADDRESS According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS stall and a render target flush prior to changing STATE_BASE_ADDRESS. A little experimentation, however, shows that this is not enough. It also appears as if you have to flush the texture cache after changing base address or things won't propagate properly.	2015-05-30 08:08:07 -07:00
Jason Ekstrand	b2b9fc9fad	vk/allocator: Don't call VALGRIND_MALLOCLIKE_BLOCK on fresh gem_mmap's	2015-05-29 21:15:47 -07:00
Jason Ekstrand	03ffa9ca31	vk: Don't crash on partial descriptor sets	2015-05-29 20:43:10 -07:00
Jason Ekstrand	4ffbab5ae0	vk/device: Allow for starting a new surface state buffer This commit allows for us to create a whole new surface state buffer when the old one runs out of room. We simply re-emit the state base address for the new state, re-emit binding tables, and keep going.	2015-05-29 17:49:41 -07:00
Jason Ekstrand	c4bd5f87a0	vk/device: Do lazy surface state emission for binding tables Before, we were emitting surface states up-front when binding tables were updated. Now, we wait to emit the surface states until we emit the binding table. This makes meta simpler and should make it easier to deal with swapping out the surface state buffer.	2015-05-29 16:51:11 -07:00
Kristian Høgsberg Kristensen	4aecec0bd6	vk: Store dynamic slot index with struct anv_descriptor_slot We need to make sure we use the right index into dynamic offset array. Dynamic descriptors can be present or not in different stages and to get the right offset, we need to compute the index at vkCreateDescriptorSetLayout time.	2015-05-29 11:32:53 -07:00
Kristian Høgsberg Kristensen	fad418ff47	vk: Implement dynamic buffer offsets We do this by creating a surface state on the fly that incorporates the dynamic offset. This patch also refactors the descriptor set layout constructor a bit to be less clever with switch statement fall through. Instead of duplicating the subtle code to update the sampler and surface slot map, we just use two switch statements.	2015-05-28 22:41:20 -07:00
Jason Ekstrand	9ffc1bed15	vk/device: Split state base address emit into its own function	2015-05-28 15:34:08 -07:00
Jason Ekstrand	468c89a351	vk/device: Use anv_batch_emit for MI_BATCH_BUFFER_START	2015-05-28 15:25:02 -07:00
Jason Ekstrand	2dc0f7fe5b	vk/device: Actually destroy batch buffers	2015-05-28 13:08:21 -07:00
Jason Ekstrand	8cf932fd25	vk/query: Don't emit a CS stall by itself Both the bspec and the simulator don't like this. I'm not sure if stalling at the scoreboard is right but it at least shuts up the simulator.	2015-05-28 10:27:53 -07:00
Jason Ekstrand	730ca0efb1	vk/device: Fixups for batch buffer chaining Somehow these didn't get merged with the other batch buffer chaining stuff. Oh well, it's here now.	2015-05-28 10:26:11 -07:00
Jason Ekstrand	de221a672d	meta: Add a default ds_state and use it when no ds state is set	2015-05-28 10:06:45 -07:00
Jason Ekstrand	6eefeb1f84	vk/meta: Share the dummy RS and CB state between clear and blit	2015-05-28 10:00:38 -07:00
Kristian Høgsberg Kristensen	5a317ef4cb	vk: Initialize dynamic state binding points to NULL We rely on these being initialized to NULL so meta can reliably detect whether or not they've been set. ds_state is also allowed to not be present so we need a well-defined value for that.	2015-05-27 22:13:48 -07:00
Chad Versace	1435bf4bc4	.gitignore: Ignore spirv2nir binary	2015-05-27 17:01:09 -07:00
Chad Versace	f559fe9134	.gitignore: Scope Vulkan's generated source files Don't ignore any file named entrypoints.{c,h}. Ignore it only if it's in src/vulkan.	2015-05-27 16:59:53 -07:00
Chad Versace	ca385dcf2a	vk: gitignore generated source files	2015-05-27 16:57:31 -07:00
Chad Versace	466f61e9f6	vk/glsl_scraper: Replace adhoc arg parsing with argparse	2015-05-27 16:56:02 -07:00
Chad Versace	fab9011c44	vk/image: Assert that VkImageTiling is valid	2015-05-27 16:21:04 -07:00
Chad Versace	c0739043b3	vk/image: Remove trailing whitespace	2015-05-27 16:15:47 -07:00
Chad Versace	4514e63893	vk/glsl: Reject invalid options The script incorrectly interpreted --blah as the input filename.	2015-05-27 16:14:26 -07:00
Chad Versace	fd8b5e0df2	vk/glsl_scraper: Indent large text blocks Indent them to the same level as if the text was code. No changes in entrypoints.{c,h} after a clean build.	2015-05-27 16:09:31 -07:00
Chad Versace	df4b02f4ed	vk/glsl_scraper: Fix code style for imports Python style is one module imported per line, and imports are at the top of the file.	2015-05-27 16:04:12 -07:00
Jason Ekstrand	b23885857f	vk/meta: Actually create the CB state for blits	2015-05-27 12:06:30 -07:00
Jason Ekstrand	da8f148203	vk: Rework anv_batch and use chaining batch buffers This mega-commit primarily does two things. The first is to turn anv_batch into a better abstraction of a batch. Instead of actually having a BO, it now has a few pointers to some piece of memory that are used to add data to the "batch". If it gets to the end, there is a function pointer that it can call to attempt to grow the batch. The second change is to start using chained batch buffers. When the end of the current batch BO is reached, it automatically creates a new one and inserts an MI_BATCH_BUFFER_START command to chain to it. In this way, our batch buffers are effectively infinite in length.	2015-05-27 11:48:28 -07:00
Jason Ekstrand	59def43fc8	Fixup for growable reloc lists	2015-05-27 11:48:28 -07:00
Jason Ekstrand	1c63575de8	vk/cmd_buffer: Allocate the surface_bo from device->batch_bo_pool	2015-05-27 11:48:28 -07:00
Jason Ekstrand	403266be05	vk/device: Make reloc lists growable	2015-05-27 11:48:28 -07:00
Jason Ekstrand	5ef81f0a05	vk/device: Use a bo pool for batch buffers	2015-05-27 11:48:28 -07:00
Jason Ekstrand	6f3e3c715a	vk/allocator: Add a BO pool	2015-05-27 11:48:28 -07:00
Jason Ekstrand	59328bac10	vk/allocator: Add a free list that acts on pointers instead of offsets	2015-05-27 11:48:28 -07:00
Kristian Høgsberg	a1d30f867d	vk: Add support for dynamic and pipeline color blend state	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	2514ac5547	vk/test: Create and use color/blend dynamic and pipeline state	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	1cd8437b9d	vk/meta: Allocate and set color/blend state For color blend, we have to set our own state to avoid inheriting bogus blend state.	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	610e6291da	vk: Allocate samplers from dynamic stream	2015-05-26 11:50:34 -07:00
Kristian Høgsberg	b29f44218d	vk: Emit color calc state This involves pulling stencil ref values out of DS dynamic state and the blend constant out of CB dynamic state.	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	5e637c5d5a	vk/pack: Generate length macros for structs	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	998837764f	vk: Program depth bias This makes 3DSTATE_RASTER a split state command.	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	0dbed616af	vk: Add support for texture component swizzle This also drops the shared create_surface_state helper and moves filling out SURFACE_STATE directly into anv_image_view_init() and anv_color_attachment_view_init().	2015-05-26 11:27:29 -07:00
Kristian Høgsberg	cbe7ed416e	vk: Implement dynamic and pipeline ds state	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	37743f90bc	vk: Set up depth and stencil buffers	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	7c0d0021eb	vk/test: Add new depth-stencil test Not yet a depth stencil test, but will become one.	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	0997a7b2e3	vk: Add basic MOCS settings This matches what we do for GL.	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	c03314bdd3	vk: Update to header files with nested struct support This will let us do MOCS settings right.	2015-05-25 20:20:31 -07:00
Jason Ekstrand	ae8c93e023	vk/cmd_buffer: Initialize the pipeline pointer to NULL If a meta operation is called before the pipeline is set, this can cause uses of undefined values. They should be harmless, but we might as well shut up valgrind on this one too.	2015-05-25 17:14:49 -07:00
Jason Ekstrand	912944e59d	vk/device: Use the correct number of viewports when creating default VP state Fixes valgrind uninitialized value errors	2015-05-25 17:14:49 -07:00
Jason Ekstrand	1b211feb6c	vk/compiler: Zero out the vs_prog_data struct when VS is disabled Prevents uninitialized value errors	2015-05-25 17:14:49 -07:00
Jason Ekstrand	903bd4b056	vk/compiler: Fix up the binding hack and make it work in NIR	2015-05-25 12:57:32 -07:00
Jason Ekstrand	57153da2d5	vk: Actually implement some sort of destructor for all object types	2015-05-22 15:15:08 -07:00
Jason Ekstrand	0f0b5aecb8	vk/pipeline: Track VB's that are actually used by the pipeline Previously, we just blasted out whatever VB's we had marked as "dirty" regardless of which ones were used by the pipeline. Given that the stride of the VB is embedded in the pipeline this can cause problems. One problem is if the pipeline doesn't use the given VB binding we emit a bogus stride. Another problem is that we weren't properly resetting the dirty bits when the pipeline changed.	2015-05-21 16:58:53 -07:00
Jason Ekstrand	0a54751910	vk/device: Memset descriptor sets to 0 and handle descriptor set holes	2015-05-21 16:33:04 -07:00
Jason Ekstrand	519fe765e2	vk: Do relocations in surface states when they are created Previously, we waited until later and did a pass through the used surfaces and did the relocations then. This led to doing double-relocations which was causing us to get bogus surface offsets.	2015-05-21 15:55:29 -07:00
Jason Ekstrand	ccf2bf9b99	vk/test: Use the glsl_scraper for building shaders	2015-05-21 12:24:02 -07:00
Jason Ekstrand	f3d70e4165	vk/glsl_scraper: Use the LunarG back-door for GLSL source	2015-05-21 12:22:44 -07:00
Jason Ekstrand	cb56372eeb	vk/glsl_scraper: Use a fake GLSL version that glslang will accept	2015-05-21 12:21:02 -07:00
Jason Ekstrand	0e441cde71	vk: Bake the GLSL_VK_SHADER macro into the scraper output file	2015-05-21 12:21:00 -07:00
Jason Ekstrand	f17e835c26	vk/meta: Use glsl_scraper for our GLSL source We are not yet using SPIR-V for meta but this is a first step.	2015-05-21 11:39:54 -07:00
Jason Ekstrand	b13c0f469b	vk: More out-of-tree build fixes	2015-05-21 11:32:59 -07:00
Jason Ekstrand	f294154e42	vk: Fix for out-of-tree builds	2015-05-21 10:23:18 -07:00
Kristian Høgsberg	f9e66ea621	vk: Remove render pass stub call This isn't really a stub.	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	a29df71dd2	vk: Add WSI implementation	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	f886647b75	vk: Add debug stubs	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	63da974529	vk: Mark remaining unsupported formats as such	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	387a1bb58f	vk: Mark VK_FORMAT_UNDEFINED as 1 cpp, 1 channel	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	a1bd426393	vk: Stream surface state instead of using the surface pool Since the binding table pointer is only 16 bits, we can only have 64kb of binding table state allocated at any given time. With a block size of 1kb, that amounts to just 64 command buffers, which is not enough.	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	01504057f5	vk: Use surface_format_info from dri driver for vkGetFormatInfo	2015-05-20 20:34:52 -07:00
Chad Versace	a61f307996	vk: Fix result of vkCreateInstance When fill_physical_device() fails, don't return VK_SUCCESS.	2015-05-20 19:51:10 -07:00
Jason Ekstrand	14929046ba	vk/compiler: Add shader language detection This commit adds support for the LunarG GLSL back-door as well as detecting regular GLSL and SPIR-V. The SPIR-V path doesn't exist yet, so that will cause an assert-fail.	2015-05-20 17:05:41 -07:00
Jason Ekstrand	47c1cf5ce6	vk/test: Add a test for testing buffer copies	2015-05-20 16:20:04 -07:00
Jason Ekstrand	bea66ac5ad	vk/meta: Add support for copying arbitrary size buffers	2015-05-20 16:20:04 -07:00
Jason Ekstrand	9557b85e3d	vk/meta: Use the biggest format possible for buffer copies This should substantially improve throughput of buffer copies.	2015-05-20 16:20:04 -07:00
Jason Ekstrand	13719e9225	vk/meta: Fix buffer copy extents	2015-05-20 16:20:04 -07:00
Jason Ekstrand	d7044a19b1	vk/meta: Use texture() instead of texture2D()	2015-05-19 12:44:35 -07:00
Jason Ekstrand	edff076188	vk: Use binding instead of index in uniform layout qualifiers This more closely matches what the Vulkan docs say to do.	2015-05-19 12:44:22 -07:00
Jason Ekstrand	e37a89136f	vk/glsl_scraper: Add a --glsl-only option	2015-05-19 11:29:07 -07:00
Jason Ekstrand	4bcf58a192	vk/glsl_scraper: Use the line number from the end of the macro We used to use the line number from the start of the macro but this doesn't seem to match the C preprocessor.	2015-05-19 11:29:07 -07:00
Jason Ekstrand	1573913194	vk/glsl_scraper: Don't open files until needed This prevents us from writing an empty file when the compile failed.	2015-05-19 11:29:07 -07:00
Kristian Høgsberg	e4c11f50b5	vk: Call finish for binding table state stream	2015-05-18 21:12:13 -07:00
Jason Ekstrand	851495d344	vk/meta: Use the new *view_init functions and stack-allocated views This should save us a good deal of the leakage that meta currently has.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	4668bbb161	vk/image: Factor view creation out into separate _init functions The _init functions work basically the same as the Vulkan entrypoints except that they act on an already-created view and take an optional command buffer option. If a command buffer is given, the surface state is allocated out of the command buffer's state stream.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	7c9f209427	Revert "vk/allocator: Don't use memfd when valgrind is detected" This reverts commit `b6ab076d6b`. It turns out setting USE_MEMFD to 0 is really bad because it means we can't resize the pool. Besides, valgrind SVN handles memfd so we really don't need this fallback for valgrind anymore.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	923691c70d	vk: Use a separate block pool and state stream for binding tables The binding table pointers packet only allows for a 16-bit binding table address so all binding tables have to be in the first 64 KB of the surface state BO. We solve this by adding a slave block pool that pulls off the first 64 KB worth of blocks and reserves them for binding tables.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	d24f8245db	vk/allocator: Add a concept of a slave block pool We probably need a better name but this will do for now.	2015-05-18 20:57:43 -07:00
Kristian Høgsberg	997596e4c4	vk/test: Add test that prints format features	2015-05-18 20:52:44 -07:00
Kristian Høgsberg	241b59cba0	vk/test: Test timestamps and occlusion queries	2015-05-18 20:52:44 -07:00
Kristian Høgsberg	ae9ac47c74	vk: Make timestamp command work correctly This was using the wrong timestamp register and needs to write a 64 bit value.	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	82ddab4b18	vk: Make occlusion query work, both copy and get functions	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	1d40e6ade8	vk: Update generated header files This fixes a problem where register addresses were incorrectly shifted.	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	f330bad545	vk: Only fill render targets for meta clear Clear inherits the render targets from the current render pass. This means we need to fill out the binding table after switching to meta bindings. However, meta copies etc happen outside a render pass and break when we try to fill in the render targets. This change fills the render targets only for meta clear.	2015-05-18 20:52:43 -07:00
Jason Ekstrand	b6c7d8c911	vk/pipeline: Use a state_stream for storing programs Previously, we were effectively using a state_stream, it was just hand-rolled based on a block pool. Now we actually use the data structure.	2015-05-18 15:58:20 -07:00
Jason Ekstrand	4063b7deb8	vk/allocator: Add support for valgrind tracking of state pools and streams We leave the block pool untracked so that reads/writes to freed blocks will get caught and do the tracking at the state pool/stream level. We have to do a few extra gymnastics for streams because valgrind works in terms of pointers and we work in terms of separate map and offset. Fortunately, the users of the state pool and stream should always be using the map pointer provided in the anv_state structure. We just have to track, per block, the map that was used when we initially got the block. Then we can make sure we always use that map and valgrind should stay happy.	2015-05-18 15:58:20 -07:00
Jason Ekstrand	b6ab076d6b	vk/allocator: Don't use memfd when valgrind is detected	2015-05-18 15:58:20 -07:00
Jason Ekstrand	682d11a6e8	vk/allocator: Assert that block_pool_grow succeeds	2015-05-18 15:48:19 -07:00
Jason Ekstrand	28804fb9e4	vk/gem: VG_CLEAR the padding for the gem_mmap struct	2015-05-18 12:05:17 -07:00
Jason Ekstrand	8440b13f55	vk/meta: Rework the indentation style No functional change.	2015-05-18 10:43:51 -07:00
Kristian Høgsberg	5286ef7849	vk: Provide more realistic values for device info	2015-05-18 10:27:08 -07:00
Kristian Høgsberg	69fd473321	vk: Use a temporary buffer for formatting in finishme This is more likely to avoid breaking up the message when racing with other threads.	2015-05-18 10:27:08 -07:00
Jason Ekstrand	cd7ab6ba4e	vk/meta: Add an initial implementation of vkCmdCopyBuffer Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	c25ce55fd3	vk/meta: Add an initial implementation of vkCmdCopyBufferToImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	08bd554cda	vk/meta: Add an initial implementation of vkCmdBlitImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	fb27d80781	vk/meta: Add an initial implementation of vkCmdCopyImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	c15f3834e3	vk/gem: Set the gem_mmap.flags parameter to 0 if it exists	2015-05-18 10:27:08 -07:00
Jason Ekstrand	f7b0f922be	vk/gem: Only VK_CLEAR the addr_ptr in gen_mmap	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	ca7e62d421	vk: Add a logger wrapper for the generated entrypoint	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	eb92745b2e	vk/gem: Just return -1 from anv_gem_wait() on error We were returning -errno, unlike all the other gem functions.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	05754549e8	vk: Fix vkGetOjectInfo return values We weren't properly returning the allocation count.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	6afb26452b	vk: Implement fences This basic implementation uses a throw-away bo for synchronization.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	e26a7ffbd9	vk/meta: Use anv_* internal entrypoints	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	b7fac7a7d1	vk: Implement allocation count query	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	783e6217fc	vk: Change pData/pDataSize semantics We now always copy the entire struct unless pData is NULL and unconditionally write back the struct size. It's not clear this is useful if the structs may grow over time, but it seems to be the expected behaviour for now.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	b4b3bd1c51	vk: Return VK_SUCCESS from vkAllocDescriptorSets This should've been returning VK_SUCCESS all along.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	a9f2115486	vk: Return VK_SUCCESS for all descriptor pool entry points	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	60ebcbed54	vk: Start Implementing vkGetFormatInfo() We move the format table and vkGetFormatInfo to their own file in the process.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	454345da1e	vk: Add script for generating ifunc entry points This lets us generate a hash table for vkGetProcAddress and lets us call public functions internally without the public entrypoint overhead.	2015-05-18 10:27:02 -07:00
Kristian Høgsberg	333bcc2072	vk: Fix vulkan header inconsistency The function pointer typedef and the function prototype for vkCmdClearColorImage() didn't agree. Fix the typedef to match the prototype.	2015-05-17 21:08:31 -07:00
Kristian Høgsberg	b9eb56a404	vk: Add function pointer typedef for intel extension Also guard function prototype by VK_PROTOTYPES.	2015-05-17 21:08:30 -07:00
Kristian Høgsberg	75cb85c56a	vk: Add missing VKAPI for vkQueueRemoveMemReferences	2015-05-17 21:08:30 -07:00
Jason Ekstrand	a924ea0c75	Merge remote-tracking branch 'fdo-personal/wip/nir-vtn' into vulkan This adds the SPIR-V -> NIR translator.	2015-05-16 12:43:16 -07:00
Jason Ekstrand	a63952510d	nir/spirv: Don't assert that the current block is empty It's possible that someone will give us SPIR-V code in which someone needlessly branches to new blocks. We should handle that ok now.	2015-05-16 12:34:34 -07:00
Jason Ekstrand	4e44dcc312	nir/spirv: Add initial support for samplers	2015-05-16 12:34:15 -07:00
Jason Ekstrand	d6f52dfb3e	nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops NIR doesn't have the native opcodes for them anymore	2015-05-16 12:33:32 -07:00
Jason Ekstrand	a53e795524	nir/types: Add support for sampler types	2015-05-16 12:32:58 -07:00
Jason Ekstrand	0fa9211d7f	nir/spirv: Make the global constants in spirv.h static I've been promised in a bug that this will be fixed in a future version of the header. However, in the interest of my branch building, I'm adding these changes in myself for the moment.	2015-05-16 11:16:34 -07:00
Jason Ekstrand	036a4b1855	nir/spirv: Handle jump-to-loop in a more general way	2015-05-16 11:16:34 -07:00
Jason Ekstrand	56f533b3a0	nir/spirv: Handle boolean uniforms correctly	2015-05-16 11:16:34 -07:00
Jason Ekstrand	64bc58a88e	nir/spirv: Handle control-flow with loops	2015-05-16 11:16:34 -07:00
Jason Ekstrand	3a2db9207d	nir/spirv: Set a name on temporary variables	2015-05-16 11:16:34 -07:00
Jason Ekstrand	a28f8ad9f1	nir/spirv: Use the correct length for copying string literals	2015-05-16 11:16:34 -07:00
Jason Ekstrand	7b9c29e440	nir/spirv: Make vtn_ssa_value handle constants as well as ssa values	2015-05-16 11:16:33 -07:00
Jason Ekstrand	b0d1854efc	nir/spirv: Add initial support for GLSL 4.50 builtins	2015-05-16 11:16:33 -07:00
Jason Ekstrand	1da9876486	nir/spirv: Split the core datastructures into a header file	2015-05-16 11:16:33 -07:00
Jason Ekstrand	98d78856f6	nir/spirv: Use the builder for all instructions We don't actually use it to create all the instructions but we do use it for insertion always. This should make things far more consistent for implementing extended instructions.	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ff828749ea	nir/spirv: Add support for a bunch of ALU operations	2015-05-16 11:16:33 -07:00
Jason Ekstrand	d2a7972557	nir/spirv: Add support for indirect array accesses	2015-05-16 11:16:33 -07:00
Jason Ekstrand	683c99908a	nir/spirv: Explicitly type constants and SSA values	2015-05-16 11:16:33 -07:00
Jason Ekstrand	c5650148a9	nir/spirv: Handle OpBranchConditional We do control-flow handling as a two-step process. The first step is to walk the instructions list and record various information about blocks and functions. This is where the actual nir_function_overload objects get created. We also record the start/stop instruction for each block. Then a second pass walks over each of the functions and over the blocks in each function in a way that's NIR-friendly and actually parses the instructions.	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ebc152e4c9	nir/spirv: Add a helper for getting a value as an SSA value	2015-05-16 11:16:33 -07:00
Jason Ekstrand	f23afc549b	nir/spirv: Split instruction handling into preamble and body sections	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ae6d32c635	nir/spirv: Implement load/store instructions	2015-05-16 11:16:33 -07:00
Jason Ekstrand	88f6fbc897	nir: Add a helper for getting the tail of a deref chain	2015-05-16 11:16:33 -07:00
Jason Ekstrand	06acd174f3	nir/spirv: Actually add variables to the function or shader	2015-05-16 11:16:33 -07:00
Jason Ekstrand	5045efa4aa	nir/spirv: Add a vtn_untyped_value helper	2015-05-16 11:16:33 -07:00
Jason Ekstrand	01f3aa9c51	nir/spirv: Use vtn_value in the types code and fix an off-by-one error	2015-05-16 11:16:33 -07:00
Jason Ekstrand	6ff0830d64	nir/types: Add an is_vector_or_scalar helper	2015-05-16 11:16:33 -07:00
Jason Ekstrand	5acd472271	nir/spirv: Add support for deref chains	2015-05-16 11:16:33 -07:00
Jason Ekstrand	7182597e50	nir/types: Add a scalar type constructor	2015-05-16 11:16:32 -07:00
Jason Ekstrand	eccd798cc2	nir/spirv: Add support for OpLabel	2015-05-16 11:16:32 -07:00
Jason Ekstrand	a6cb9d9222	nir/spirv: Add support for declaring functions	2015-05-16 11:16:32 -07:00
Jason Ekstrand	8ee23dab04	nir/types: Add accessors for function parameter/return types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	707b706d18	nir/spirv: Add support for declaring variables Deref chains and variable load/store operations are still missing.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	b2db85d8e4	nir/spirv: Add support for constants	2015-05-16 11:16:32 -07:00
Jason Ekstrand	3f83579664	nir/spirv: Add basic support for types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	e9d3b1e694	nir/types: Add more helpers for creating types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	fe550f0738	glsl/types: Expose the function_param and struct_field structs to C Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find them. This commit simply moves the ifdef.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	053778c493	glsl/types: Add support for function types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	7b63b3de93	glsl: Add GLSL_TYPE_FUNCTION to the base types enums	2015-05-16 11:16:32 -07:00
Jason Ekstrand	2b570a49a9	nir/spirv: Rework the way values are added Instead of having functions to add values and set various things, we just have a function that does a few asserts and then returns the value. The caller is then responsible for setting the various fields.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	f9a31ba044	nir/spirv: Add stub support for extension instructions	2015-05-16 11:16:32 -07:00
Jason Ekstrand	4763a13b07	REVERT: Add a simple helper program for testing SPIR-V -> NIR translation	2015-05-16 11:16:32 -07:00
Jason Ekstrand	cae8db6b7e	glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp	2015-05-16 11:16:32 -07:00
Jason Ekstrand	98452cd8ae	nir: Add the start of a SPIR-V to NIR translator At the moment, it can handle the very basics of strings and can ignore debug instructions. It also has basic support for decorations.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	573ca4a4a7	nir: Import the revision 30 SPIR-V header from Khronos	2015-05-16 11:16:31 -07:00
Jason Ekstrand	057bef8a84	vk/device: Use bias rather than layers for computing binding table size Because we statically use the first 8 binding table entries for render targets, we need to create a table of size 8 + surfaces.	2015-05-16 10:42:53 -07:00
Jason Ekstrand	22e61c9da4	vk/meta: Make clear a no-op if no layers need clearing Among other things, this prevents recursive meta.	2015-05-16 10:30:05 -07:00
Jason Ekstrand	120394ac92	vk/meta: Save and restore the old bindings pointer If we don't do this then recursive meta is completely broken. What happens is that the outer meta call may change the bindings pointer and the inner meta call will change it again and, when it exits set it back to the default. However, the outer meta call may be relying on it being left alone so it uses the non-meta descriptor sets instead of its own.	2015-05-16 10:28:04 -07:00
Jason Ekstrand	4223de769e	vk/device: Simplify surface_count calculation	2015-05-16 10:23:09 -07:00
Jason Ekstrand	eb1952592e	vk/glsl_helpers: Fix GLSL_VK_SHADER with respect to commas Previously, the GLSL_VK_SHADER macro didn't work if the shader contained commas outside of parentheses due to the way the C preprocessor works. This commit fixes this by making it variadic again and doing it correctly this time.	2015-05-15 22:17:07 -07:00
Kristian Høgsberg	3b9f32e893	vk: Make cmd_buffer->bindings a pointer This lets us save and restore efficiently by just moving the pointer to a temporary bindings struct for meta.	2015-05-15 18:12:07 -07:00
Kristian Høgsberg	9540130c41	vk: Move vertex buffers into struct anv_bindings	2015-05-15 16:34:31 -07:00
Kristian Høgsberg	0cfc493775	vk: Fix GLSL_VK_SHADER macro Stringify doesn't work with __ARGV__. The last macro argument swallows up excess arguments and as such we can just stringify that.	2015-05-15 16:15:04 -07:00
Kristian Høgsberg	af45f4a558	vk: Fix warning from missing initializer Struct initializers need to be { 0, } to zero out the variable they're initializing.	2015-05-15 16:07:17 -07:00
Kristian Høgsberg	bf096c9ec3	vk: Build binding tables at bind descriptor time This changes the way descriptor sets and layouts work so that we fill out binding table contents at the time we bind descriptor sets. We manipulate the binding table contents and sampler state in a shadow-copy in anv_cmd_buffer. At draw time, we allocate the actual binding table and sampler state and flush the anv_cmd_buffer copies.	2015-05-15 16:05:31 -07:00
Kristian Høgsberg	1f6c220b45	vk: Update the bind map length to reflect MAX_SETS	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	b806e80e66	vk: Flip back to using memfd for the allocators	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	0a775e1eab	vk: Rename dyn_state_pool to dynamic_state_pool Given that we already tolerate surface_state_pool and the even longer instruction_state_pool, there's no reason to arbitrarily abbreviate dynamic.	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	f5b0f1351f	vk: Consolidate image, buffer and color attachment views These are all just surface state, offset and a bo.	2015-05-15 15:22:29 -07:00
Jason Ekstrand	41db8db0f2	vk: Add a GLSL scraper utility This new utility, glsl_scraper.py scrapes C files for instances of the GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to SPIR-V. The compilation is done using glslValidator. The result is then placed into another C file as arrays of dwords that can be easily handed to a Vulkan driver.	2015-05-14 19:18:57 -07:00
Jason Ekstrand	79ace6def6	vk/meta: Add a magic GLSL shader source macro	2015-05-14 19:07:34 -07:00
Jason Ekstrand	018a0c1741	vk/meta: Add a better comment about the VS for blits	2015-05-14 11:39:32 -07:00
Jason Ekstrand	8c92701a69	vk/test: Use VK_IMAGE_TILING_OPTIMAL for the render target	2015-05-13 22:27:38 -07:00
Jason Ekstrand	4fb8bddc58	vk/test: Do a copy of the RT into a linear buffer and write that to a PNG	2015-05-13 22:23:30 -07:00
Jason Ekstrand	bd5b76d6d0	vk/meta: Add the start of a blit implementation Currently, we only implement CopyImageToBuffer	2015-05-13 22:23:30 -07:00
Jason Ekstrand	94b8c0b810	vk/pipeline: Default to a SamplerCount of 1 for PS	2015-05-13 22:23:30 -07:00
Jason Ekstrand	d3d4776202	vk/pipeline: Add an extra flag for force-disabling the vertex shader This way we can pass in a vertex shader and yet have the pipeline emit an empty 3DSTATE_VS packet. We need this for meta because we need to trick the compiler into not deleting our inputs but at the same time disable the VS so that we can use a rectlist. This should go away once we actually get SPIR-V.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	a1309c5255	vk/pass: Emit a flushing pipe control at the end of the pass This is rather crude but it at least makes sure that all the render targets get flushed at the end of the pass. We probably actually want to do something based on image layout transitions, but this will work for now.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	07943656a7	vk/compiler: Set the binding table texture_start This is by no means a complete solution to the binding table problems. However, it does make texturing actually work. Before, we were texturing from the render target since they were both starting at 0.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	cd197181f2	vk/compiler: Zero the prog data We use prog_data[stage] != NULL to determine whether or not we need to clean up that stage. Make sure it default to NULL.	2015-05-13 22:22:59 -07:00
Jason Ekstrand	1f7dcf9d75	vk/image: Stash more information in images and views	2015-05-13 22:22:59 -07:00
Jason Ekstrand	43126388cd	vk/meta: Save/restore more stuff in cmd_buffer_restore	2015-05-13 22:22:59 -07:00
Chad Versace	50806e8dec	vk: Install headers I need this for building a testsuite.	2015-05-13 17:49:26 -07:00
Kristian Høgsberg	83c7e1f1db	vk: Add support for sampler descriptors	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	4f9eaf77a5	vk: Use a typesafe anv_descriptor struct	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	5c9d77600b	vk: Create and bind a sampler in vk.c	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	18acfa7301	vk: Fix copy-n-paste sType in vkCreateSampler	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a1ec789b0b	vk: Add a dynamic state stream to anv_cmd_buffer We'll need this for sampler state.	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	3f52c016fa	vk: Move struct anv_sampler to private.h	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a77229c979	vk: Allocate layout->count number of descriptors layout->count is the number of descriptors the application requested. layout->total is the number of entries we need across all stages.	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a3fd136509	vk: Fill out sampler state from API values	2015-05-13 14:47:11 -07:00
Chad Versace	828817b88f	vk: Ignore vk executable	2015-05-13 12:05:38 -07:00
Kristian Høgsberg	2b7a060178	vk: Fix stale error handling in vkQueueSubmit	2015-05-12 14:38:58 -07:00
Kristian Høgsberg	cb986ef597	vk: Submit all cmd buffers passed to vkQueueSubmit	2015-05-12 14:38:12 -07:00
Kristian Høgsberg	9905481552	vk: Add generated header for HSW and IVB (GEN75 and GEN7)	2015-05-12 14:29:04 -07:00
Jason Ekstrand	ffe9f60358	vk: Add stub() and stub_return() macros and mark piles of functions as stubs	2015-05-12 13:45:02 -07:00
Jason Ekstrand	d3b374ce59	vk/util: Add a anv_finishme function/macro	2015-05-12 13:43:36 -07:00
Jason Ekstrand	7727720585	vk/meta: Break setting up meta clear state into its own function	2015-05-12 13:03:50 -07:00
Jason Ekstrand	4336a1bc00	vk/pipeline: Add support for disabling the scissor in "extra"	2015-05-12 12:53:01 -07:00
Kristian Høgsberg	d77c34d1d2	vk: Add clear load-op for render passes	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	b734e0bcc5	vk: Add support for driver-internal custom pipelines This lets us disable the viewport, use rect lists and repclear.	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	ad132bbe48	vk: Fix 3DSTATE_VERTEX_BUFFER emission Set VertexBufferIndex to the attribute binding, not the location.	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	6a895c6681	vk: Add 32 bpc signed and unsigned integer formats	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	55b9b703ea	vk: Add anv_batch_emit_merge() helper macro This lets us emit a state packet by merging two half-baked versions, typically one from the pipeline object and one from a dynamic state object.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	099faa1a2b	vk: Store bo pointer in anv_image and anv_buffer We don't need to point back to the memory object the bo came from. Pointing directly to a bo lets us bind images and buffers to other bos - like our allocator bos.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	4f25f5d86c	vk: Support not having a vertex shader This lets us bypass the vertex shader and pass data straight into the rasterizer part of the pipeline.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	20ad071190	vk: Allow NULL as a valid pipeline layout Vertex buffers and render targets aren't part of the layout so having an empty layout is pretty common.	2015-05-11 22:12:56 -07:00
Kristian Høgsberg	769785c497	Add vulkan driver for BDW	2015-05-09 11:38:32 -07:00

1860 changed files with 202072 additions and 41489 deletions

3

.gitignore vendored

@@ -34,6 +34,7 @@ aclocal.m4
 config.log
 config.status
 cscope*
 tags
 .scon*
 config.py
 build
@@ -46,3 +47,5 @@ manifest.txt
 Makefile
 Makefile.in
 .install-mesa-links
 .install-gallium-links
 /src/git_sha1.h

460

.mailmap Normal file

@@ -0,0 +1,460 @@
 Aapo Tahkola <aet@rasterburn.org> <aapo@aapo-desktop.(none)>
 Adam Jackson <ajax@redhat.com> <ajax@benzedrine.nwnk.net>
 Adam Jackson <ajax@redhat.com> <ajax@freedesktop.org>
 Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Adrian Negreanu <adrian.m.negreanu@intel.com>
 Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Negreanu Marius Adrian <adrian.m.negreanu@intel.com>
 Dave Airlie <airlied@redhat.com> <airliedfreedesktop.org>
 Dave Airlie <airlied@redhat.com> airlied <airlied@unused-12-215.bne.redhat.com>
 Dave Airlie <airlied@redhat.com> <airlied@dhcp-1-203.bne.redhat.com>
 Dave Airlie <airlied@redhat.com> <airlied@gmail.com>
 Dave Airlie <airlied@redhat.com> <airlied@itt42.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@linux.ie>
 Dave Airlie <airlied@redhat.com> <airlied@nx6125b.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@panoply-rh.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@ppcg5.localdomain>
 Alan Coopersmith <alan.coopersmith@oracle.com> <alan.coopersmith@sun.com>
 Alan Hourihane <alanh@vmware.com> <alanh@tungstengraphics.com>
 Alan Hourihane <alanh@vmware.com> <alanh@fairlite.demon.co.uk>
 Alan Hourihane <alanh@vmware.com> <alanh@jetpack.(none)>
 Alexander Monakov <amonakov@gmail.com> <amonakov@ispras.ru>
 Alexander von Gluck IV <kallisti5@unixzen.com> Alexander von Gluck <kallisti5@unixzen.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.prom.eng.vmware.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.vmware.com>
 Alex Deucher <alexdeucher@gmail.com> <alexander.deucher@amd.com>
 Alex Deucher <alexdeucher@gmail.com> <agd5f@yahoo.com>
 Alex Deucher <alexdeucher@gmail.com> <alex@botch2.com>
 Alex Deucher <alexdeucher@gmail.com> <alex@botch2.(none)>
 Alex Deucher <alexdeucher@gmail.com> <alex@cube.(none)>
 Alex Deucher <alexdeucher@gmail.com> <alex@samba.(none)>
 Andreas Fänger <a.faenger@e-sign.com> <a.faenger@e-sign.com>
 Andreas Hartmetz <ahartmetz@gmail.com> <andreas.hartmetz@kdab.com>
 Andre Heider <a.heider@gmail.com>
 Andreas Heider <andreas@heider.io>
 Andreas Pokorny <andreas.pokorny@canonical.com> <andreas.pokorny@elektrobit.com>
 Andrew Randrianasulu <randrianasulu@gmail.com> <randrik_a@yahoo.com>
 Andrew Randrianasulu <randrianasulu@gmail.com> <randrik@mail.ru>
 Arthur Huillet <arthur.huillet@free.fr> Arthur HUILLET <arthur.huillet@free.fr>
 Benjamin Franzke <benjaminfranzke@googlemail.com> ben <benjaminfranzke@googlemail.com>
 Ben Skeggs <bskeggs@redhat.com> <darktama@beleth.(none)>
 Ben Skeggs <bskeggs@redhat.com> <darktama@iinet.net.au>
 Ben Skeggs <bskeggs@redhat.com> <darktama@nisroch.keine.ath.cx>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb-at-gmail.com>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@gmail.com>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@localhost.localdomain>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@nisroch.keine.ath.cx>
 Ben Widawsky <benjamin.widawsky@intel.com> Ben Widawsky <ben@bwidawsk.net>
 Blair Sadewitz <blair.sadewitz@gmail.com> Blair Sadewitz <blair.sadewitz.gmail.com>
 Boris Peterbarg <reist@users.sourceforge.net> reist <reist>
 Brian Paul <brianp@vmware.com> Brian <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> <brian.e.paul@gmail.com>
 Brian Paul <brianp@vmware.com> <brianp@kemper.freedesktop.org>
 Brian Paul <brianp@vmware.com> brian <brian@cvp965.(none)>
 Brian Paul <brianp@vmware.com> Brian <brian@i915.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@nostromo.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@poulsbo.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@ps3.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brianp@vmware.com>
 Brian Paul <brianp@vmware.com> Brian <brian@yutani.localnet.net>
 Brian Paul <brianp@vmware.com> root <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> root <root@i915.localnet.net>
 Brian Paul <brianp@vmware.com> root <root@nostromo.localnet.net>
 Brian Paul <brianp@vmware.com> root <root@i965.localnet.net>
 Bruce Merry <bmerry@users.sourceforge.net> <bmerry@gmail.com>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mail.zih.tu-dresden.de>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
 Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
 Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
 Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
 Chih-Wei Huang <cwhuang@linux.org.tw> Chih-Wei Huang <cwhuang@android-x86.org>
 Christian König <christian.koenig@amd.com> Christian Koenig <christian.koenig@amd.com>
 Christian König <christian.koenig@amd.com> Christian König <christian.koenig at amd.com>
 Christian König <christian.koenig@amd.com> Christian König <deathsimple@vodafone.de>
 Christoph Brill <egore911@egore911.de> Christoph Bill <egore@gmx.de>
 Christoph Brill <egore911@egore911.de> <egore@gmx.de>
 Christoph Bumiller <christoph.bumiller@speed.at> <e0425955@student.tuwien.ac.at>
 Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Christopher James Halse Rogers <raof@ubuntu.com>
 Claudio Ciccani <klan@directfb.org> <klan@users.sf.net>
 Claudio Ciccani <klan@directfb.org> <klan@users.sourceforge.net>
 Connor Abbott <cwabbott0@gmail.com> <connor.w.abbott@intel.com>
 Connor Abbott <cwabbott0@gmail.com> <connor.abbott@intel.com>
 Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomed...@gmail.com>
 Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomedude@gmail.com>
 Courtney Goeltzenleuchter <courtney@lunarg.com> <courtney@LunarG.com>
 Daniel Skinner <sio@users.sourceforge.net> sio <sio>
 Daniel Stone <daniels@collabora.com> <daniel@fooishbar.org>
 David Miller <davem@davemloft.net> David S. Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> Dave Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> davem69 <davem69>
 David Heidelberger <david.heidelberger@ixit.cz> David Heidelberg <david@ixit.cz>
 David Heidelberger <david.heidelberger@ixit.cz> <d.okias@gmail.com>
 David Reveman <reveman@chromium.org> <c99drn@cs.umu.se>
 Dieter Nützel <Dieter@nuetzel-hh.de> Dieter Nützel <dieter@nuetzel-hh.de>
 Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.com>
 Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
 Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
 Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
 Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>
 Fabian Bieler <der.fabe@gmx.net> <&lt;der.fabe@gmx.net&gt>
 Feng, Haitao <haitao.feng@intel.com> Haitao Feng <haitao.feng@intel.com>
 Frank Henigman <fjhenigman@google.com> <fjhenigman@chromium.org>
 George Sapountzis <gsapountzis@gmail.com> George Sapountzis <gsap7@yahoo.gr>
 Gwenole Beauchesne <gwenole.beauchesne@intel.com> <gb.devel@gmail.com>
 Hamish Marson <hmarson@users.sourceforge.net> hmarson <hmarson>
 Hans de Goede <hdegoede@redhat.com> Hans de Goede <j.w..r..degoede@hhs.nl>
 Homer Hsing <dongsheng.xing@intel.com> <homer.hsing@gmail.com>
 Hui Qi Tay <hqtay@vmware.com> <tayhuiqithq@gmail.com>
 Ian Romanick <ian.d.romanick@intel.com> <idr@freedesktop.org>
 Ian Romanick <ian.d.romanick@intel.com> <idr@us.ibm.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@vmware.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.(none)>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.walkyrie.se>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@tungstengraphics.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <wallbraker 'at' gmail 'dot' com>
 Jakub Bogusz <qboosh@pld-linux.org> <gboosh@pld-linux.org>
 James Legg <jlegg@feralinteractive.com> <lankyleggy@gmail.com>
 Jan Vesely <jano.vesely@gmail.com> Jan Vesely <jan.vesely@rutgers.edu>
 Jason Ekstrand <jason@jlekstrand.net> <jason.ekstrand@intel.com>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremyhu@freedesktop.org>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@tifa.local>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@vincent.local>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@yuffie.local>
 Jeremy Huddleston <jeremyhu@apple.com> Jeremy Huddleston Sequoia <jeremyhu@apple.com>
 Jeremy Kolb <jkolb@freedesktop.org> <jkolb@brandeis.edu>
 Jerome Glisse <jglisse@redhat.com> <glisse@freedesktop.org>
 Jerome Glisse <jglisse@redhat.com> <glisse@kemper.freedesktop.org>
 Jerome Glisse <jglisse@redhat.com> John Doe <glisse@barney.(none)>
 Jerome Glisse <jglisse@redhat.com> John Doe <glisse@localhost.localdomain>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.lan>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.(none)>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-desktop.localdomain>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-t61.(none)>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@virtuousgeek.org>
 Joakim Sindholt <bacn@zhasha.com> <opensource@zhasha.com>
 Joakim Sindholt <bacn@zhasha.com> <zhasha@gallium-dev.(none)>
 Jochen Gerlach <jtg@users.sourceforge.net> jtg <jtg>
 Joel Bosveld <joel.bosveld@gmail.com> <Joel.Bosveld@gmail.com>
 Jonathan Adamczewski <jadamcze@utas.edu.au> <jadamcze@utas.edu.a>
 Jon Turney <jon.turney@dronecode.org.uk> Jon TURNEY <jon.turney@dronecode.org.uk>
 José Fonseca <jfonseca@vmware.com> Jose Fonseca <jfonseca@vmware.com>
 José Fonseca <jfonseca@vmware.com> Jose Fonseca <jrfonseca@tungstengraphics.com>
 José Fonseca <jfonseca@vmware.com> <jfonseca@pegasus.(none)>
 José Fonseca <jfonseca@vmware.com> <jfonseca@titan.(none)>
 José Fonseca <jfonseca@vmware.com> <jose.r.fonseca@gmail.com>
 José Fonseca <jfonseca@vmware.com> <jrfonseca@tungstengraphics.com>
 José Fonseca <jfonseca@vmware.com> <j_r_fonseca@yahoo.co.uk>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <jouk@hrem.nano.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <joukj@hrem.stm.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> joukj <joukj@tarantella.(none)>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.nano.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.(none)>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> J.Jansen <joukj@tarantella.nano.tudelft.nl>
 Juan Zhao <juan.j.zhao@intel.com> <juan.j.zhao@linux.intel.com>
 Julien Cristau <jcristau@debian.org> <julien.cristau@logilab.fr>
 Julien Isorce <j.isorce@samsung.com> <julien.isorce@gmail.com>
 Kalyan Kondapally <kalyan.kondapally@intel.com> <kondapallykalyancontribute@gmail.com>
 Karl Schultz <karl.w.schultz@gmail.com> Karl Schultze <k.w.schultz@comcast.net>
 Karl Schultz <karl.w.schultz@gmail.com> unknown <kwschult@.na.qualcomm.com>
 Karl Schultz <karl.w.schultz@gmail.com> <k.w.schultz@comcast.net>
 Karl Schultz <karl.w.schultz@gmail.com> <Karl.W.Schultz@gmail.com>
 Karl Schultz <karl.w.schultz@gmail.com> <kschultz@freedesktop.org>
 Keith Harrison <sio2@users.sourceforge.net> sio2 <sio2>
 Keith Packard <keithp@keithp.com> <keithp@koto.keithp.com>
 Keith Packard <keithp@keithp.com> <keithp@neko.keithp.com>
 Keith Whitwell <keithw@vmware.com> <keith@tungstengraphics.com>
 Keith Whitwell <keithw@vmware.com> keithw <keithw@keithw-laptop.(none)>
 Kristian Høgsberg <krh@bitplanet.net> <krh@redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
 Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>
 Li Peng <peng.li@intel.com> <peng.li@linux.intel.com>
 Lucas Stach <dev@lynxeye.de> <l.stach@pengutronix.de>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <dev@mblankhorst.nl>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <m.b.lankhorst@gmail.com>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <maarten.lankhorst@canonical.com>
 Maciej Cencora <m.cencora@gmail.com> <maciej@osiris.(none)>
 Marc-André Lureau <marcandre.lureau@gmail.com> Marc-Andre Lureau <marcandre.lureau@gmail.com>
 Marc Dietrich <marvin24@gmx.de> Marc <marvin24@gmx.de>
 Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
 Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
 Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
 Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
 Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>
 Mark Mueller <markkmueller@gmail.com> <MarkKMueller@gmail.com>
 Marta Lofstedt <marta.lofstedt@intel.com> <marta.lofstedt@linux.intel.com>
 Martin Peres <martin.peres@linux.intel.com> <martin.peres@labri.fr>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@gmx.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@web.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Frohlich <M.Froehlich@science-computing.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <frohlich8@users.sourceforge.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@gmx.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@web.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> M.Froehlich@science-computing.de <M.Froehlich@science-computing.de>
 Matthew W. S. Bell <matthew@bells23.org.uk> Matthew Bell <matthew@bells23.org.uk>
 Maxence Le Doré <maxence.ledore@gmail.com> Maxence Le Dore <maxence.ledore@gmail.com>
 Micah Fedke <micah.fedke@collabora.co.uk> <M.Fedke@Astronautics.com>
 Michal Krol <michal@vmware.com> <michal@tungstengraphics.com>
 Michal Krol <michal@vmware.com> Michal Krol <michal@ubuntu-vbox.(none)>
 Michal Krol <michal@vmware.com> Michal Krol <mjkrol@gmail.org>
 Michal Krol <michal@vmware.com> michal <michal@capacitor.(none)>
 Michal Krol <michal@vmware.com> michal <michal@michal-laptop.(none)>
 Michal Krol <michal@vmware.com> michal <michal@quad.(none)>
 Michal Krol <michal@vmware.com> michal <michal@transistor.(none)>
 Michal Krol <michal@vmware.com> Michal <michal@tungstengraphics.com>
 Michal Krol <michal@vmware.com> michal <michal@wmvare.com>
 Michel Dänzer <michel@daenzer.net> <michel.daenzer@amd.com>
 Michel Dänzer <michel@daenzer.net> <daenzer@vmware.com>
 Michel Dänzer <michel@daenzer.net> <michel@tungstengraphics.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <michel.daenzer@amd.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <daenzer@localhost.(none)>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> Mike Kaplinksiy <mike.kaplinskiy@gmail.com>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> <mike.kaplinskiy@gmai.com>
 Mike Stroyan <mike@lunarg.com> <mike@LunarG.com>
 Nian Wu <nian.wu@intel.com> <nian@graphics.(none)>
 Nian Wu <nian.wu@intel.com> <nian@tinderbox.sh.intel.com>
 Nick Bowler <nbowler@draconx.ca>
 Nick Sarnie <commendsarnex@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> <nhaehnle@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <nhaehnle@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect_@gmx.net>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect@upb.de>
 Nigel Stewart <nigels@users.sourceforge.net> <nigels@sourceforge.net>
 Nigel Stewart <nigels@users.sourceforge.net> <nstewart@nvidia.com>
 nobled <nobled@dreamwidth.org> <nobled2@nobled2-karmic.(none)>
 Oliver McFadden <oliver.mcfadden@linux.intel.com> <z3ro.geek@gmail.com>
 Owain Ainsworth <zerooa@googlemail.com> Owain G. Ainsworth <oga@openbsd.org>
 Owen W. Taylor <otaylor@fishsoup.net> Owen Taylor <otaylor@snell.localdomain>
 Patrice Mandin <patmandin@gmail.com> <patrice@manoir.racoon.city>
 Patrice Mandin <patmandin@gmail.com> <pmandin@caramail.com>
 Patrice Mandin <patmandin@gmail.com> <pmandin@freedesktop.org>
 Pauli Nieminen <pauli.nieminen@linux.intel.com> <suokkos@gmail.com>
 Paulo Zanoni <paulo.r.zanoni@intel.com> Paulo Zanoni <pzanoni@mandriva.com>
 Paul Seidler <sepek@exherbo.org> Paul Seidler <pl.seidler@googlemail.com>
 Pekka Paalanen <pekka.paalanen@collabora.co.uk> <ppaalanen@gmail.com>
 Pekka Paalanen <pekka.paalanen@collabora.co.uk> <pq@iki.fi>
 Peter Hutterer <peter.hutterer@who-t.net> <peter@cs.unisa.edu.au>
 Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> pepp <pelloux@gmail.com>
 Pierre Willenbrock <pierre@pirsoft.de> Pierre Willenbrok <pierre@pirsoft.de>
 Quentin Glidic <sardemff7+git@sardemff7.net> <sardemff7@sardemff7.net>
 RALOVICH, Kristóf <tade60@freemail.hu> <kristof.ralovich@gmail.com>
 Richard Li <richardradeon@gmail.com> <RichardZ.Li@amd.com>
 # The next ones are not 100% sure
 Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop3.(none)>
 Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop.(none)>
 Richard Li <richardradeon@gmail.com> root <root@richard-desktop.(none)>
 Richard Sandiford <rsandifo@linux.vnet.ibm.com> <r.sandiford@uk.ibm.com>
 Rob Clark <robclark@freedesktop.org> <Rob Clark robdclark@freedesktop.org>
 Rob Clark <robclark@freedesktop.org> <robdclark@gmail.com>
 Robert Bragg <robert@sixbynine.org> <robert@linux.intel.com>
 Robert Ellison <papillo@vmware.com> <papillo@i965-laptop.(none)>
 Robert Ellison <papillo@vmware.com> <papillo@tungstengraphics.com>
 Robert Hooker <sarvatt@ubuntu.com> <robert.hooker@canonical.com>
 Roland Scheidegger <sroland@vmware.com> <rscheidegger@gmx.ch>
 Roland Scheidegger <sroland@vmware.com> <sroland@tungstengraphics.com>
 Roy Spliet <rspliet@eclipso.eu> <r.spliet@student.tudelft.nl>
 Rune Petersen <rune@megahurts.dk> Rune Peterson <rune@megahurts.dk>
 Ryan Houdek <sonicadvance1@gmail.com> <Sonicadvance1@gmail.com>
 Sam Hocevar <sam@hocevar.net> Sam Hocevar <sam@zoy.org>
 Samuel Iglesias Gonsálvez <siglesias@igalia.com> Samuel Iglesias Gonsalvez <siglesias@igalia.com>
 Sean D'Epagnier <sean@depagnier.com> <geckosenator@freedesktop.org>
 Serge Martin <edb+mesa@sigluy.net> Serge Martin (EdB) <edb+mesa@sigluy.net>
 Serge Martin <edb+mesa@sigluy.net> EdB <edb+mesa@sigluy.net>
 Sinclair Yeh <syeh@vmware.com> <sinclair.yeh@intel.com>
 Stefan Brüns <stefan.bruens@rwth-aachen.de> <Stefan.Bruens@rwth-aachen.de>
 Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <marchesin@icps.u-strasbg.fr>
 Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <stephane.marchesin@gmail.com>
 Sven M. Hallberg <pesco@users.sourceforge.net> pesco <pesco>
 Tapani Pälli <tapani.palli@intel.com> <tapani.palli@gmail.com>
 Tapani Pälli <tapani.palli@intel.com> Tapani <tapani.palli@intel.com>
 Thierry Reding <treding@nvidia.com> <thierry@gilfi.de>
 Thierry Reding <treding@nvidia.com> <thierry.reding@avionic-design.de>
 Thierry Vignaud <thierry.vignaud@gmail.com> <tvignaud@mandriva.com>
 Thomas Balling Sørensen <tball@io.dk> <tball@tball-laptop.(none)>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas <thellstrom@vmware.com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thellstrom-at-vmware-dot-com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas-at-tungstengraphics-dot-com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas@tungstengraphics.com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellström <thomas@tungstengraphics.com>
 Thomas Tanner <tanner@gmx.net> tanner <tanner>
 Tilman Sauerbeck <tilman@code-monkey.de> <tilman@freedesktop.org>
 Timothy Arceri <timothy.arceri@collabora.com> <t_arceri@yahoo.com.au>
 Timothy Arceri <timothy.arceri@collabora.com> Timothy <t_arceri@yahoo.com.au>
 Tom Fogal <tfogal@alumni.unh.edu> <tfogal@sci.utah.edu>
 Tom Stellard <thomas.stellard@amd.com> <tstellar@gmail.com>
 Tom Stellard <thomas.stellard@amd.com> Thomas Stellard <tom.stellard@amd.com>
 Tormod Volden <debian.tormod@gmail.com> <lists.tormod@gmail.com>
 Török Edwin <edwin+mesa@etorok.net> Török Edvin <edwintorok@gmail.com>
 Török Edwin <edwin+mesa@etorok.net> <edwintorok@gmail.com>
 Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@freedesktop.org>
 Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@sci.fi>
 Vincent Lejeune <vljn@ovi.com> <peluche.canard@gmail.com>
 Vinson Lee <vlee@freedesktop.org> <vlee@vmware.com>
 Zhenyu Wang <zhenyuw@linux.intel.com> Wang Zhenyu <zhenyu.z.wang@intel.com>
 Zack Rusin <zackr@vmware.com> <zack@kde.org>
 Zack Rusin <zackr@vmware.com> <zack@pixel.(none)>
 Zack Rusin <zackr@vmware.com> <zack@tungstengraphics.com>
 Zhang <zxpmyth@yahoo.com.cn> zhang <zxpmyth@yahoo.com.cn>

									
										28

.travis.yml
									
												
				@@ -1,6 +1,7 @@

				language: c

				sudo: false

				sudo: true

				dist: trusty

				cache:

				  directories:

				@@ -15,7 +16,11 @@ addons:

				      - libexpat1-dev

				      - libxcb-dri2-0-dev

				      - libx11-xcb-dev

				      - llvm-3.4-dev

				      - llvm-3.5-dev

				      # llvm-config is not in the dev package?

				      - llvm-3.5

				      # LLVM packaging is broken and misses this dep.

				      - libedit-dev

				      - scons

				env:

				@@ -41,6 +46,16 @@ install:

				  - export PATH="/usr/lib/ccache:$PATH"

				  - pip install --user mako

				  # Since libdrm gets updated in configure.ac regularly, try to pick up the

				  # latest version from there.

				  - for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do

				      old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;

				      new_ver=`echo $line | sed 's/.*REQUIRED=//'`;

				      if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then

				        export LIBDRM_VERSION="libdrm-$new_ver";

				      fi;

				    done

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				@@ -78,22 +93,19 @@ install:

				  - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				  - tar -jxvf $LIBDRM_VERSION.tar.bz2

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 && make install)

				  - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				  - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				  - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				# Disabled LLVM (and therefore r300 and r600) because the build fails

				# with "undefined reference to `clock_gettime'" and "undefined

				# reference to `setupterm'" in llvmpipe.

				script:

				  - if test "x$BUILD" = xmake; then

				      ./autogen.sh --enable-debug

				        --disable-gallium-llvm

				        --with-egl-platforms=x11,drm

				        --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau

				        --with-gallium-drivers=svga,swrast,vc4,virgl

				        --with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600

				        --disable-llvm-shared-libs

				        ;

				      make && make check;

				    elif test x$BUILD = xscons; then
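
The libdrm loop in the install step above relies on `sort -Vc` to decide whether one version string is older than another. A minimal sketch of that comparison in isolation, assuming a `sort` that supports the `-V` (version sort) option, as GNU coreutils does. The helper names `version_le` and `pick_newer` are made up for illustration and do not appear in the Mesa tree:

```shell
#!/bin/sh
# version_le A B: succeed when version A sorts at or before version B.
# Hypothetical helper; relies on the -V (version sort) extension of sort.
# With -c, sort only checks the order and exits non-zero if it is wrong.
version_le() {
    printf '%s\n%s\n' "$1" "$2" | sort -Vc 2>/dev/null
}

# Same idea as the .travis.yml loop: keep the newer of two versions.
pick_newer() {
    if version_le "$1" "$2"; then
        echo "$2"
    else
        echo "$1"
    fi
}

pick_newer 2.4.60 2.4.66
```

A plain string comparison would get cases such as 2.4.9 versus 2.4.10 wrong, which is presumably why the loop uses version sort.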

									
										13

Android.common.mk
									
												
				@@ -33,6 +33,7 @@ MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)

				LOCAL_CFLAGS += \

					-Wno-unused-parameter \

					-Wno-date-time \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

				@@ -53,6 +54,7 @@ LOCAL_CFLAGS += \

					-DHAVE___BUILTIN_CLZLL \

					-DHAVE___BUILTIN_UNREACHABLE \

					-DHAVE_PTHREAD=1 \

					-DHAVE_DLOPEN \

					-fvisibility=hidden \

					-Wno-sign-compare

				@@ -64,7 +66,6 @@ ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

					-DUSE_X86_ASM \

					-DHAVE_DLOPEN \

				endif

				endif

				@@ -82,9 +83,19 @@ LOCAL_CPPFLAGS += \

					-Wno-error=non-virtual-dtor \

					-Wno-non-virtual-dtor

				ifeq ($(MESA_LOLLIPOP_BUILD),true)

				  LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				  LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"

				else

				  LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				endif

				# uncomment to keep the debug symbols

				#LOCAL_STRIP_MODULE := false

				ifeq ($(strip $(LOCAL_MODULE_TAGS)),)

				LOCAL_MODULE_TAGS := optional

				endif

				# Quiet down the build system and remove any .h files from the sources

				LOCAL_SRC_FILES := $(patsubst %.h, , $(LOCAL_SRC_FILES))

									
										14

Android.mk
									
												
				@@ -42,6 +42,10 @@ $(call local-intermediates-dir)

				endef

				endif

				MESA_DRI_MODULE_REL_PATH := dri

				MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				@@ -84,19 +88,23 @@ MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)

				ifneq ($(strip $(MESA_GPU_DRIVERS)),)

				SUBDIRS := \

					src/gbm \

					src/loader \

					src/mapi \

					src/compiler \

					src/glsl \

					src/mesa \

					src/util \

					src/egl \

					src/intel/genxml \

					src/intel/isl \

					src/mesa/drivers/dri

				INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

				ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)

				SUBDIRS += src/gallium

				INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)

				endif

				include $(call all-named-subdir-makefiles,$(SUBDIRS))

				include $(INC_DIRS)

				endif

									
										16

Makefile.am
									
												
				@@ -22,20 +22,29 @@

				SUBDIRS = src

				AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-dri \

					--enable-dri3 \

					--enable-egl \

					--enable-gallium-tests \

					--enable-gallium-osmesa \

					--enable-gallium-llvm \

					--enable-gbm \

					--enable-gles1 \

					--enable-gles2 \

					--enable-glx \

					--enable-glx-tls \

					--enable-nine \

					--enable-opencl \

					--enable-opengl \

					--enable-va \

					--enable-vdpau \

					--enable-xa \

					--enable-xvmc \

					--disable-llvm-shared-libs \

					--with-egl-platforms=x11,wayland,drm \

					--enable-llvm-shared-libs \

					--with-egl-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \

					--with-vulkan-drivers=intel

				ACLOCAL_AMFLAGS = -I m4

				@@ -53,6 +62,7 @@ noinst_HEADERS = \

					include/c99_math.h \

					include/c11 \

					include/D3D9 \

					include/GL/wglext.h \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids

106

REVIEWERS Normal file


@@ -0,0 +1,106 @@
 Overview:
 	This file is similar in syntax to (or, more precisely, uses a subset of)
 	what is used by the MAINTAINERS file in the Linux kernel.  Some fields
 	do not apply; for example, in all cases, send patches to:
 		mesa-dev@lists.freedesktop.org
 	and in all cases the patchwork instance is:
 		https://patchwork.freedesktop.org/project/mesa/
 	The purpose is not exactly the same as that of the MAINTAINERS file in
 	the Linux kernel, as there are no official/formal maintainers of
 	different subsystems in Mesa, but it is meant to give an idea of who to
 	CC for various patches for review, and to allow the use of
 	scripts/get_reviewer.pl as git --cc-cmd.
 Usage:
 	When sending patches:
 		git send-email --cc-cmd ./scripts/get_reviewer.pl ...
 	Or to configure as default:
 		git config sendemail.cccmd ./scripts/get_reviewer.pl
 Descriptions of section entries:
 	R: Designated reviewer: FullName <address@domain>
 	   These reviewers should be CCed on patches.
 	F: Files and directories with wildcard patterns.
 	   A trailing slash includes all files and subdirectory files.
 	   F:	drivers/net/	all files in and below drivers/net
 	   F:	drivers/net/*	all files in drivers/net, but not below
 	   F:	*/net/*		all files in "any top level directory"/net
 	   One pattern per line.  Multiple F: lines acceptable.
 	N: Files and directories with regex patterns.
 	   N:	[^a-z]tegra	all files whose path contains the word tegra
 	   One pattern per line.  Multiple N: lines acceptable.
 	   scripts/get_maintainer.pl has different behavior for files that
 	   match F: pattern and matches of N: patterns.  By default,
 	   get_maintainer will not look at git log history when an F: pattern
 	   match occurs.  When an N: match occurs, git log history is used
 	   to also notify the people that have git commit signatures.
 Maintainers List (try to look for most precise areas first)
 Note: this is an opt-in system; I have not tried to add anyone who hasn't
 either asked me or sent a patch to add themselves.
 		-----------------------------------
 NIR
 R:	Jason Ekstrand <jason@jlekstrand.net>
 F:	src/compiler/nir/
 DOCUMENTATION
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: docs/
 F: doxygen/
 COMPATIBILITY HEADERS
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: include/c99*
 DRI LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/loader/
 GALLIUM LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/auxiliary/pipe-loader/
 F: src/gallium/auxiliary/target-helpers/
 GALLIUM TARGETS
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/targets/
 AUTOCONF BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: configure.ac
 F: */Automake.inc
 F: */Makefile.*am
 F: */Makefile.sources
 SCONS BUILD
 F: scons/
 F: */SConscript*
 F: */Makefile.sources
 ANDROID BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: CleanSpec.mk
 F: */Android.*mk
 F: */Makefile.sources
 WAYLAND EGL SUPPORT
 R: Daniel Stone <daniels@collabora.com>
 F: src/egl/wayland/*
 F: src/egl/drivers/dri2/platform_wayland.c
 FREEDRENO
 R:	Rob Clark <robclark@freedesktop.org>
 F:	src/gallium/drivers/freedreno/
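
The F: pattern rules described in the REVIEWERS file above (a trailing slash covers everything below a directory, while a `*` covers one level only) can be sketched in shell. This is only an illustration of the rules as the file states them; whether scripts/get_reviewer.pl implements them exactly this way is an assumption, and the function names and sample paths are hypothetical:

```shell
#!/bin/sh
# Count the '/' characters in a string.
count_slashes() {
    printf '%s' "$1" | tr -cd '/' | wc -c | tr -d ' '
}

# matches_f_pattern PATTERN PATH: succeed if PATH is covered by an F: pattern.
matches_f_pattern() {
    pattern=$1
    path=$2
    case $pattern in
        */)
            # Trailing slash: all files in and below the directory.
            case $path in
                "$pattern"*) return 0 ;;
            esac
            return 1
            ;;
    esac
    case $path in
        $pattern)
            # A shell '*' also matches '/', so require the same directory
            # depth to keep "dir/*" from matching files further below.
            test "$(count_slashes "$pattern")" -eq "$(count_slashes "$path")"
            return $?
            ;;
    esac
    return 1
}

for p in src/loader/loader.c src/egl/wayland/wayland-drm/wayland-drm.c; do
    if matches_f_pattern 'src/egl/wayland/*' "$p"; then
        echo "$p: matched"
    else
        echo "$p: not matched"
    fi
done
```

Comparing the number of slashes is one simple way to get the "but not below" behaviour without a full glob-to-regex translation.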

									
										19

SConstruct
									
												
				@@ -1,7 +1,7 @@

				#######################################################################

				# Top-level SConstruct

				#

				# For example, invoke scons as 

				# For example, invoke scons as

				#

				#   scons build=debug llvm=yes machine=x86

				#

				@@ -12,13 +12,13 @@

				#   build='debug'

				#   llvm=True

				#   machine='x86'

				# 

				#

				# Invoke

				#

				#   scons -h

				#

				# to get the full list of options. See scons manpage for more info.

				#  

				#

				import os

				import os.path

				@@ -36,7 +36,7 @@ common.AddOptions(opts)

				env = Environment(

					options = opts,

					tools = ['gallium'],

					toolpath = ['#scons'],	

					toolpath = ['#scons'],

					ENV = os.environ,

				)

				@@ -53,7 +53,7 @@ else:

				    print 'scons: warning: targets option is deprecated; pass the targets on their own such as'

				    print

				    print '  scons %s' % ' '.join(targets)

				    print 

				    print

				    COMMAND_LINE_TARGETS.append(targets)

				@@ -84,9 +84,14 @@ env.Append(CPPPATH = [

				#print env.Dump()

				# Add a check target for running tests

				check = env.Alias('check')

				env.AlwaysBuild(check)

				#######################################################################

				# Invoke host SConscripts 

				# 

				# Invoke host SConscripts

				#

				# For things that are meant to be run on the native host build machine, instead

				# of the target machine.

				#

2

VERSION


@@ -1 +1 @@
 .2.0-devel
 12.0.6

									
										9

appveyor.yml
									
												
				@@ -37,6 +37,8 @@ cache:

				- win_flex_bison-2.4.5.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				os: Visual Studio 2013

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				@@ -47,11 +49,13 @@ install:

				- python -m pip --version

				# Install Mako

				- python -m pip install --egg Mako

				# Install pywin32 extensions, needed by SCons

				- python -m pip install pypiwin32

				# Install SCons

				- python -m pip install --egg scons==2.4.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

				@@ -65,6 +69,9 @@ install:

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1

				after_build:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check

				# It's possible to setup notification here, as described in

				# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

28

bin/.cherry-ignore Normal file


@@ -0,0 +1,28 @@
 # The offending commit that this patch (part) reverts isn't in 12.0
 be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable
 # The patch depends on the batch_cache work at least.
 f00f749fda4c1beca38f362c7f86bdc6e32785 a4xx: make sure to actually clamp depth as requested
 # The patch depends on the 'generic' interpolation and location
 # implementation introduced with 2d6dd30a9b30
 b22beafb2d07006b197c62d717fc7f80cc i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode
 # VAAPI encode landed after the branch point.
 a5993022275c20061ac025d9adc26c5f9d02afee st/va Avoid VBR bitrate calculation overflow v2
 # EGL_KHR_debug landed after the branch point.
 b6f9340f798111e53e08f5d35c7630cee48 egl: Fix missing unlock in eglGetSyncAttribKHR
 # Depends on update_renderbuffer_read_surfaces at least
 f2b9b0c730e345bcffa9eadabb25af3ab02642f2 i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads.
 # The commit in question hasn't landed in branch
 ef787339774bc7f1cc9c1615722f944005e070c Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT"
 # Patches depend on the fence_finish() gallium API change and corresponding driver work
 f240ad98bc05281ea7013d91973cb5f932ae9434 st/mesa: unduplicate st_check_sync code
 b687f766fddb7b39479cd9ee0427984029ea3559 st/mesa: allow multiple concurrent waiters in ClientWaitSync
 # Commit was reverted shortly after it landed in master
 a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM

									
										2

bin/bugzilla_mesa.sh
									
												
				@@ -40,7 +40,7 @@ else

					for i in $urls

					do

						id=$(echo $i | cut -d'=' -f2)

						summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')

						summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')

						echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"

						echo ""

					done
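
The changed `sed` expression in the hunk above strips the bug number and the `&ndash;` separator from the page title, leaving only the summary. A small sketch of that extraction on a made-up title line; the bug number and summary text are placeholders, and `[0-9][0-9]*` is used as a portable spelling of the `[0-9]\+` in the script:

```shell
#!/bin/sh
# extract_summary: read HTML on stdin, print the summary from the <title> line.
# Hypothetical wrapper around the grep | sed pipeline used by the script.
extract_summary() {
    grep -e '<title>.*</title>' |
        sed -e 's/ *<title>[0-9][0-9]* &ndash; \(.*\)<\/title>/\1/'
}

# Made-up sample input; not taken from a real bug report.
printf '%s\n' '  <title>12345 &ndash; Example summary text</title>' | extract_summary
```

The older expression additionally expected the literal word "Bug" before the number, so a title without it would pass through unmodified.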

									
										35

bin/get-extra-pick-list.sh
									
										Executable file
									
												
				@@ -0,0 +1,35 @@

				#!/bin/sh

				# Script for generating a list of candidates which fix commits that have been

				# previously cherry-picked to a stable branch.

				#

				# Usage examples:

				#

				# $ bin/get-extra-pick-list.sh

				# $ bin/get-extra-pick-list.sh > picklist

				# $ bin/get-extra-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				# XXX: there should be a better way for this

				latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\

					cut -c -8 |\

				while read sha

				do

					# Check if the original commit is referenced in master

					git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\

						cut -c -8 |\

					while read candidate

					do

						# Check whether the potential fix has not landed in the branch yet.

						found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`

						if test $found = 0

						then

							echo Commit $candidate might need to be picked, as it references $sha

						fi

					done

				done
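
The first stage of the script above turns "cherry picked from commit" trailers into abbreviated hashes. A minimal sketch of that stage on its own, fed with a made-up commit message instead of `git log` output; the function name and the hash are placeholders:

```shell
#!/bin/sh
# extract_picked_shas: read commit messages on stdin and print the first
# eight characters of every hash named in a "cherry picked from" trailer.
extract_picked_shas() {
    grep "cherry picked from commit" |
        sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |
        cut -c -8
}

# Made-up input resembling indented `git log` output.
printf '%s\n' \
    '    example: subject line of a backported commit' \
    '    (cherry picked from commit 0123456789abcdef0123456789abcdef01234567)' |
    extract_picked_shas
```

Each abbreviated hash is then searched for in the commit messages on master, on the assumption that a follow-up fix mentions the commit it fixes.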

									
										2

bin/get-pick-list.sh
									
												View File
												
				@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\

				git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)' HEAD..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.
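
The tightened `--grep` pattern in the hunk above only accepts CC lines that name the 12.0 branch. A sketch that applies the same expression with plain `grep` to single lines, assuming a `grep` that accepts `\|` in basic regular expressions (GNU grep does); the function name `is_nominated` is hypothetical:

```shell
#!/bin/sh
# is_nominated LINE: succeed if LINE would count as a 12.0 stable nomination.
is_nominated() {
    printf '%s\n' "$1" |
        grep -i -q '^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)'
}

for line in \
    'Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>' \
    'Cc: mesa-stable@lists.freedesktop.org'
do
    if is_nominated "$line"; then
        echo "nominated:     $line"
    else
        echo "not nominated: $line"
    fi
done
```

With this pattern a CC line that does not mention 12.0 is no longer picked up, so such nominations would have to be handled by other means.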

									
										39

bin/get-typod-pick-list.sh
									
										Executable file
									
												
				@@ -0,0 +1,39 @@

				#!/bin/sh

				# Script for generating a list of candidates which have typos in the nomination line

				#

				# Usage examples:

				#

				# $ bin/get-typod-pick-list.sh

				# $ bin/get-typod-pick-list.sh > picklist

				# $ bin/get-typod-pick-list.sh | tee picklist

				# NB:

				# This script intentionally _never_ checks for a specific version tag.

				# Should we consider folding it with the original get-pick-list.sh?

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' HEAD..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.

					if [ -f bin/.cherry-ignore ] ; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

					fi

					# Check to see if it has already been picked over.

					if grep -q ^$sha already_picked ; then

						continue

					fi

					git log -n1 --pretty=oneline $sha | cat

				done

				rm -f already_picked

									
										1

common.py
									
												
				@@ -97,6 +97,7 @@ def AddOptions(opts):

				    opts.Add(BoolOption('embedded', 'embedded build', 'no'))

				    opts.Add(BoolOption('analyze',

				                        'enable static code analysis where available', 'no'))

				    opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))

				    opts.Add('toolchain', 'compiler toolchain', default_toolchain)

				    opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',

				                        'no'))

414

configure.ac


@@ -68,7 +68,7 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_REQUIRED=2.4.66
 LIBDRM_RADEON_REQUIRED=2.4.56
 LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
@@ -110,10 +110,10 @@ LT_INIT([disable-static])
 AC_CHECK_PROG(RM, rm, [rm -f])
 AX_PROG_BISON([],
               AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
               AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
                     [AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
 AX_PROG_FLEX([],
              AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
              AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-lex.c"],
                    [AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
 AC_CHECK_PROG(INDENT, indent, indent, cat)
@@ -223,8 +223,11 @@ AX_GCC_FUNC_ATTRIBUTE([format])
 AX_GCC_FUNC_ATTRIBUTE([malloc])
 AX_GCC_FUNC_ATTRIBUTE([packed])
 AX_GCC_FUNC_ATTRIBUTE([pure])
 AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
 AX_GCC_FUNC_ATTRIBUTE([unused])
 AX_GCC_FUNC_ATTRIBUTE([visibility])
 AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
 AX_GCC_FUNC_ATTRIBUTE([weak])
 AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
@@ -247,7 +250,11 @@ _SAVE_CPPFLAGS="$CPPFLAGS"
 dnl Compiler macros
 DEFINES="-D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS"
 AC_SUBST([DEFINES])
 android=no
 case "$host_os" in
 *-android)
     android=yes
     ;;
 linux*|*-gnu*|gnu*)
     DEFINES="$DEFINES -D_GNU_SOURCE"
     ;;
@@ -259,6 +266,8 @@ cygwin*)
     ;;
 esac
 AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
 dnl Add flags for gcc and g++
 if test "x$GCC" = xyes; then
     CFLAGS="$CFLAGS -Wall"
@@ -513,6 +522,8 @@ else
    DEFINES="$DEFINES -DNDEBUG"
 fi
 DEFAULT_GL_LIB_NAME=GL
 dnl
 dnl Check if linker supports -Bsymbolic
 dnl
@@ -610,6 +621,23 @@ esac
 AM_CONDITIONAL(HAVE_COMPAT_SYMLINKS, test "x$HAVE_COMPAT_SYMLINKS" = xyes)
 DEFAULT_GL_LIB_NAME=GL
 dnl
 dnl Libglvnd configuration
 dnl
 AC_ARG_ENABLE([libglvnd],
     [AS_HELP_STRING([--enable-libglvnd],
         [Build for libglvnd @<:@default=disabled@:>@])],
     [enable_libglvnd="$enableval"],
     [enable_libglvnd=no])
 AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
 #AM_COND_IF([USE_LIBGLVND_GLX], [DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"])
 if test "x$enable_libglvnd" = xyes ; then
     DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
     DEFAULT_GL_LIB_NAME=GLX_mesa
 fi
 dnl
 dnl library names
 dnl
@@ -647,13 +675,13 @@ AC_ARG_WITH([gl-lib-name],
   [AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@],
     [specify GL library name @<:@default=GL@:>@])],
   [GL_LIB=$withval],
   [GL_LIB=GL])
   [GL_LIB="$DEFAULT_GL_LIB_NAME"])
 AC_ARG_WITH([osmesa-lib-name],
   [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
     [specify OSMesa library name @<:@default=OSMesa@:>@])],
   [OSMESA_LIB=$withval],
   [OSMESA_LIB=OSMesa])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB=GL])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
 AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
 dnl
@@ -704,8 +732,10 @@ test "x$enable_asm" = xno && AC_MSG_RESULT([no])
 if test "x$enable_asm" = xyes -a "x$cross_compiling" = xyes; then
     case "$host_cpu" in
     i?86 | x86_64 | amd64)
         enable_asm=no
         AC_MSG_RESULT([no, cross compiling])
         if test "x$host_cpu" != "x$target_cpu"; then
             enable_asm=no
             AC_MSG_RESULT([no, cross compiling])
         fi
         ;;
     esac
 fi
@@ -754,6 +784,7 @@ if test "x$enable_asm" = xyes; then
     esac
 fi
 AC_HEADER_MAJOR
 AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
 AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
 AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
@@ -796,6 +827,10 @@ dnl to -pthread, which causes problems if we need -lpthread to appear in
 dnl pkgconfig files.
 test -z "$PTHREAD_LIBS" && PTHREAD_LIBS="-lpthread"
 PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
 AC_SUBST(PTHREADSTUBS_CFLAGS)
 AC_SUBST(PTHREADSTUBS_LIBS)
 dnl SELinux awareness.
 AC_ARG_ENABLE([selinux],
     [AS_HELP_STRING([--enable-selinux],
@@ -856,8 +891,8 @@ AC_ARG_ENABLE([dri3],
     [enable_dri3="$enableval"],
     [enable_dri3="$dri3_default"])
 AC_ARG_ENABLE([glx],
     [AS_HELP_STRING([--enable-glx],
         [enable GLX library @<:@default=enabled@:>@])],
     [AS_HELP_STRING([--enable-glx@<:@=dri|xlib|gallium-xlib@:>@],
         [enable the GLX library and choose an implementation @<:@default=auto@:>@])],
     [enable_glx="$enableval"],
     [enable_glx=yes])
 AC_ARG_ENABLE([osmesa],
@@ -923,17 +958,6 @@ AC_ARG_ENABLE([opencl_icd],
            @<:@default=disabled@:>@])],
     [enable_opencl_icd="$enableval"],
     [enable_opencl_icd=no])
 AC_ARG_ENABLE([xlib-glx],
     [AS_HELP_STRING([--enable-xlib-glx],
         [make GLX library Xlib-based instead of DRI-based @<:@default=disabled@:>@])],
     [enable_xlib_glx="$enableval"],
     [enable_xlib_glx=no])
 AC_ARG_ENABLE([r600-llvm-compiler],
     [AS_HELP_STRING([--enable-r600-llvm-compiler],
         [Enable experimental LLVM backend for graphics shaders @<:@default=disabled@:>@])],
     [enable_r600_llvm="$enableval"],
     [enable_r600_llvm=no])
 AC_ARG_ENABLE([gallium-tests],
     [AS_HELP_STRING([--enable-gallium-tests],
@@ -992,36 +1016,86 @@ AM_CONDITIONAL(NEED_OPENGL_COMMON, test "x$enable_opengl" = xyes -o \
                                         "x$enable_gles1" = xyes -o \
                                         "x$enable_gles2" = xyes)
 if test "x$enable_glx" = xno; then
     AC_MSG_WARN([GLX disabled, disabling Xlib-GLX])
     enable_xlib_glx=no
 # Validate GLX options
 if test "x$enable_glx" = xyes; then
     if test "x$enable_dri" = xyes; then
         enable_glx=dri
     elif test -n "$with_gallium_drivers"; then
         enable_glx=gallium-xlib
     else
         enable_glx=xlib
     fi
 fi
 case "x$enable_glx" in
 xdri | xxlib | xgallium-xlib)
     # GLX requires OpenGL
     if test "x$enable_opengl" = xno; then
         AC_MSG_ERROR([GLX cannot be built without OpenGL])
     fi
 if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
     AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
     # Check individual dependencies
     case "x$enable_glx" in
     xdri)
         if test "x$enable_dri" = xno; then
             AC_MSG_ERROR([DRI-based GLX requires DRI to be enabled])
         fi
         ;;
     xxlib)
         if test "x$enable_dri" = xyes; then
             AC_MSG_ERROR([Xlib-based GLX cannot be built with DRI enabled])
         fi
         ;;
     xgallium-xlib )
         if test "x$enable_dri" = xyes; then
             AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built with DRI enabled])
         fi
         if test -z "$with_gallium_drivers"; then
             AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built without Gallium enabled])
         fi
         ;;
     esac
     ;;
 xno)
     ;;
 *)
     AC_MSG_ERROR([Illegal value for --enable-glx: $enable_glx])
     ;;
 esac
 AM_CONDITIONAL(HAVE_GLX, test "x$enable_glx" != xno)
 AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xdri)
 AM_CONDITIONAL(HAVE_XLIB_GLX, test "x$enable_glx" = xxlib)
 AM_CONDITIONAL(HAVE_GALLIUM_XLIB_GLX, test "x$enable_glx" = xgallium-xlib)
 dnl
 dnl Libglvnd configuration
 dnl
 AC_ARG_ENABLE([libglvnd],
     [AS_HELP_STRING([--enable-libglvnd],
         [Build for libglvnd @<:@default=disabled@:>@])],
     [enable_libglvnd="$enableval"],
     [enable_libglvnd=no])
 AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
 if test "x$enable_libglvnd" = xyes ; then
     dnl XXX: update once we can handle more than libGL/glx.
     dnl Namely: we should error out if neither of the glvnd enabled libraries
     dnl are built
     case "x$enable_glx" in
     xno)
         AC_MSG_ERROR([cannot build libglvnd without GLX])
         ;;
     xxlib | xgallium-xlib )
         AC_MSG_ERROR([cannot build libgvnd when Xlib-GLX or Gallium-Xlib-GLX is enabled])
         ;;
     xdri)
         ;;
     esac
     PKG_CHECK_MODULES([GLVND], libglvnd >= 0.1.0)
     DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
     DEFAULT_GL_LIB_NAME=GLX_mesa
 fi
 if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
     AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
 fi
 # Disable GLX if OpenGL is not enabled
 if test "x$enable_glx$enable_opengl" = xyesno; then
     AC_MSG_WARN([OpenGL not enabled, disabling GLX])
     enable_glx=no
 fi
 # Disable GLX if DRI and Xlib-GLX are not enabled
 if test "x$enable_glx" = xyes -a \
         "x$enable_dri" = xno -a \
         "x$enable_xlib_glx" = xno; then
     AC_MSG_WARN([Neither DRI nor Xlib-GLX enabled, disabling GLX])
     enable_glx=no
 fi
 AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xyes -a \
                                   "x$enable_dri" = xyes)
 # Check for libdrm
 PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
                   [have_libdrm=yes], [have_libdrm=no])
@@ -1076,10 +1150,6 @@ dnl
 dnl Driver specific build directories
 dnl
 if test -n "$with_gallium_drivers" -a "x$enable_glx$enable_xlib_glx" = xyesyes; then
     NEED_WINSYS_XLIB="yes"
 fi
 if test "x$enable_gallium_osmesa" = xyes; then
     if ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
         AC_MSG_ERROR([gallium_osmesa requires the gallium swrast driver])
@@ -1272,8 +1342,8 @@ AC_ARG_ENABLE([driglx-direct],
 dnl
 dnl libGL configuration per driver
 dnl
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xxlib | xgallium-xlib)
     # Xlib-based GLX
     dri_modules="x11 xext xcb"
     PKG_CHECK_MODULES([XLIBGL], [$dri_modules])
@@ -1283,7 +1353,7 @@ xyesyes)
     GL_LIB_DEPS="$GL_LIB_DEPS $SELINUX_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS"
     GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $SELINUX_LIBS -lm $PTHREAD_LIBS"
     ;;
 xyesno)
 xdri)
     # DRI-based GLX
     PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
@@ -1310,7 +1380,7 @@ xyesno)
             if test x"$enable_dri3" = xyes; then
                PKG_CHECK_EXISTS([xcb >= $XCB_REQUIRED], [], AC_MSG_ERROR([DRI3 requires xcb >= $XCB_REQUIRED]))
                dri3_modules="xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                dri3_modules="xcb xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
             fi
         fi
@@ -1372,11 +1442,11 @@ AC_SUBST([HAVE_XF86VIDMODE])
 dnl
 dnl More GLX setup
 dnl
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xxlib | xgallium-xlib)
     DEFINES="$DEFINES -DUSE_XSHM"
     ;;
 xyesno)
 xdri)
     DEFINES="$DEFINES -DGLX_INDIRECT_RENDERING"
     if test "x$driglx_direct" = xyes; then
         DEFINES="$DEFINES -DGLX_DIRECT_RENDERING"
@@ -1549,8 +1619,58 @@ if test -n "$with_dri_drivers"; then
     DRI_DIRS=`echo $DRI_DIRS|tr " " "\n"|sort -u|tr "\n" " "`
 fi
 #
 # Vulkan driver configuration
 #
 AC_ARG_WITH([vulkan-drivers],
     [AS_HELP_STRING([--with-vulkan-drivers@<:@=DIRS...@:>@],
         [comma delimited Vulkan drivers list, e.g.
         "intel"
         @<:@default=no@:>@])],
     [with_vulkan_drivers="$withval"],
     [with_vulkan_drivers="no"])
 # Doing '--without-vulkan-drivers' will set this variable to 'no'.  Clear it
 # here so that the script doesn't choke on an unknown driver name later.
 case "x$with_vulkan_drivers" in
     xyes) with_vulkan_drivers="$VULKAN_DRIVERS_DEFAULT" ;;
     xno) with_vulkan_drivers='' ;;
 esac
 AC_ARG_WITH([vulkan-icddir],
     [AS_HELP_STRING([--with-vulkan-icddir=DIR],
         [directory for the Vulkan driver icd files @<:@${datarootdir}/vulkan/icd.d@:>@])],
     [VULKAN_ICD_INSTALL_DIR="$withval"],
     [VULKAN_ICD_INSTALL_DIR='${datarootdir}/vulkan/icd.d'])
 AC_SUBST([VULKAN_ICD_INSTALL_DIR])
 if test -n "$with_vulkan_drivers"; then
     VULKAN_DRIVERS=`IFS=', '; echo $with_vulkan_drivers`
     for driver in $VULKAN_DRIVERS; do
         case "x$driver" in
         xintel)
             if test "x$HAVE_I965_DRI" != xyes; then
                 AC_MSG_ERROR([Intel Vulkan driver requires the i965 dri driver])
             fi
             if test "x$with_sha1" == "x"; then
                 AC_MSG_ERROR([Intel Vulkan driver requires SHA1])
             fi
             HAVE_INTEL_VULKAN=yes;
             ;;
         *)
             AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
             ;;
         esac
     done
     VULKAN_DRIVERS=`echo $VULKAN_DRIVERS|tr " " "\n"|sort -u|tr "\n" " "`
 fi
 AM_CONDITIONAL(NEED_MEGADRIVER, test -n "$DRI_DIRS")
 AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \
 AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_glx" = xxlib -o \
                                   "x$enable_osmesa" = xyes -o \
                                   -n "$DRI_DIRS")
@@ -1565,7 +1685,7 @@ AC_ARG_WITH([osmesa-bits],
     [osmesa_bits="$withval"],
     [osmesa_bits=8])
 if test "x$osmesa_bits" != x8; then
     if test "x$enable_dri" = xyes -o "x$enable_glx" = xyes; then
     if test "x$enable_dri" = xyes -o "x$enable_glx" != xno; then
         AC_MSG_WARN([Ignoring OSMesa channel bits because of non-OSMesa driver])
         osmesa_bits=8
     fi
@@ -1721,7 +1841,12 @@ if test "x$enable_xvmc" = xyes -o \
         "x$enable_vdpau" = xyes -o \
         "x$enable_omx" = xyes -o \
         "x$enable_va" = xyes; then
     PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     if test x"$enable_dri3" = xyes; then
         PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED
                                  x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     else
         PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     fi
     need_gallium_vl_winsys=yes
 fi
 AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
@@ -1735,6 +1860,7 @@ AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
 if test "x$enable_vdpau" = xyes; then
     PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED])
     gallium_st="$gallium_st vdpau"
     DEFINES="$DEFINES -DHAVE_ST_VDPAU"
 fi
 AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
@@ -1873,8 +1999,8 @@ if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
     AC_MSG_ERROR([cannot build egl state tracker without EGL library])
 fi
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland_scanner],
         WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland_scanner`,
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
         WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
         WAYLAND_SCANNER='')
 if test "x$WAYLAND_SCANNER" = x; then
     AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner])
@@ -1911,6 +2037,9 @@ for plat in $egl_platforms; do
 			AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
 		;;
 	android)
 		;;
 	*)
 		AC_MSG_ERROR([EGL platform '$plat' does not exist])
 		;;
@@ -1931,11 +2060,11 @@ else
     EGL_NATIVE_PLATFORM="_EGL_INVALID_PLATFORM"
 fi
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
 AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
 AM_CONDITIONAL(HAVE_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_DRM, echo "$egl_platforms" | grep -q 'drm')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_SURFACELESS, echo "$egl_platforms" | grep -q 'surfaceless')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_NULL, echo "$egl_platforms" | grep -q 'null')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_ANDROID, echo "$egl_platforms" | grep -q 'android')
 AM_CONDITIONAL(HAVE_EGL_DRIVER_DRI2, test "x$HAVE_EGL_DRIVER_DRI2" != "x")
@@ -1976,6 +2105,9 @@ AC_ARG_WITH([llvm-prefix],
 strip_unwanted_llvm_flags() {
     # Use \> (marks the end of the word)
     echo `$1` | sed \
 	-e 's/-march=\S*//g' \
 	-e 's/-mtune=\S*//g' \
 	-e 's/-mcpu=\S*//g' \
 	-e 's/-DNDEBUG\>//g' \
 	-e 's/-D_GNU_SOURCE\>//g' \
 	-e 's/-pedantic\>//g' \
@@ -2052,6 +2184,10 @@ if test "x$enable_gallium_llvm" = xyes; then
         LLVM_COMPONENTS="engine bitwriter mcjit mcdisassembler"
         if $LLVM_CONFIG --components | grep -q inteljitevents ; then
             LLVM_COMPONENTS="${LLVM_COMPONENTS} inteljitevents"
         fi
         if test "x$enable_opencl" = xyes; then
             llvm_check_version_for "3" "5" "0" "opencl"
@@ -2191,6 +2327,55 @@ radeon_llvm_check() {
     fi
 }
 swr_llvm_check() {
     gallium_require_llvm $1
     if test ${LLVM_VERSION_INT} -lt 306; then
         AC_MSG_ERROR([LLVM version 3.6 or later required when building $1])
     fi
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
 }
 swr_require_cxx_feature_flags() {
     feature_name="$1"
     preprocessor_test="$2"
     option_list="$3"
     output_var="$4"
     AC_MSG_CHECKING([whether $CXX supports $feature_name])
     AC_LANG_PUSH([C++])
     save_CXXFLAGS="$CXXFLAGS"
     save_IFS="$IFS"
     IFS=","
     found=0
     for opts in $option_list
     do
         unset IFS
         CXXFLAGS="$opts $save_CXXFLAGS"
         AC_COMPILE_IFELSE(
             [AC_LANG_PROGRAM(
                 [   #if !($preprocessor_test)
                     #error
                     #endif
                 ])],
             [found=1; break],
             [])
         IFS=","
     done
     IFS="$save_IFS"
     CXXFLAGS="$save_CXXFLAGS"
     AC_LANG_POP([C++])
     if test $found -eq 1; then
         AC_MSG_RESULT([$opts])
         eval "$output_var=\$opts"
         return 0
     fi
     AC_MSG_RESULT([no])
     AC_MSG_ERROR([swr requires $feature_name support])
     return 1
 }
 dnl Duplicates in GALLIUM_DRIVERS_DIRS are removed by sorting it after this block
 if test -n "$with_gallium_drivers"; then
     gallium_drivers=`IFS=', '; echo $with_gallium_drivers`
@@ -2225,14 +2410,8 @@ if test -n "$with_gallium_drivers"; then
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             gallium_require_drm "Gallium R600"
             gallium_require_drm_loader
             if test "x$enable_r600_llvm" = xyes -o "x$enable_opencl" = xyes; then
                 radeon_llvm_check "r600g"
                 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
             fi
             if test "x$enable_r600_llvm" = xyes; then
                 USE_R600_LLVM_COMPILER=yes;
             fi
             if test "x$enable_opencl" = xyes; then
                 radeon_llvm_check "r600g"
                 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
             fi
             ;;
@@ -2263,6 +2442,26 @@ if test -n "$with_gallium_drivers"; then
                 HAVE_GALLIUM_LLVMPIPE=yes
             fi
             ;;
         xswr)
             swr_llvm_check "swr"
             swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
                 ",-std=c++11" \
                 SWR_CXX11_CXXFLAGS
             AC_SUBST([SWR_CXX11_CXXFLAGS])
             swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
                 ",-mavx,-march=core-avx" \
                 SWR_AVX_CXXFLAGS
             AC_SUBST([SWR_AVX_CXXFLAGS])
             swr_require_cxx_feature_flags "AVX2" "defined(__AVX2__)" \
                 ",-mavx2 -mfma -mbmi2 -mf16c,-march=core-avx2" \
                 SWR_AVX2_CXXFLAGS
             AC_SUBST([SWR_AVX2_CXXFLAGS])
             HAVE_GALLIUM_SWR=yes
             ;;
         xvc4)
             HAVE_GALLIUM_VC4=yes
             gallium_require_drm "vc4"
@@ -2352,6 +2551,10 @@ AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
@@ -2373,12 +2576,16 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
 AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
 AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
 AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
                                         "x$HAVE_I965_DRI" = xyes)
 AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
                                             "x$HAVE_GALLIUM_R600" = xyes -o \
                                             "x$HAVE_GALLIUM_RADEONSI" = xyes)
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$NEED_WINSYS_XLIB" = xyes)
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
 AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes)
 AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
 AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
@@ -2387,9 +2594,10 @@ if test "x$USE_VC4_SIMULATOR" = xyes -a "x$HAVE_GALLIUM_ILO" = xyes; then
 fi
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" = xyes -o \
                                         "x$enable_gallium_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
 AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
@@ -2421,6 +2629,29 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
 AC_SUBST([XA_TINY], $XA_TINY)
 AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
 AC_SUBST([TIMESTAMP_CMD], '`test $(SOURCE_DATE_EPOCH) && echo $(SOURCE_DATE_EPOCH) || date +%s`')
 AC_ARG_ENABLE(valgrind,
               [AS_HELP_STRING([--enable-valgrind],
                              [Build mesa with valgrind support (default: auto)])],
                              [VALGRIND=$enableval], [VALGRIND=auto])
 if test "x$VALGRIND" != xno; then
 	PKG_CHECK_MODULES(VALGRIND, [valgrind], [have_valgrind=yes], [have_valgrind=no])
 fi
 AC_MSG_CHECKING([whether to enable Valgrind support])
 if test "x$VALGRIND" = xauto; then
 	VALGRIND="$have_valgrind"
 fi
 if test "x$VALGRIND" = "xyes"; then
 	if ! test "x$have_valgrind" = xyes; then
 		AC_MSG_ERROR([Valgrind support required but not present])
 	fi
 	AC_DEFINE([HAVE_VALGRIND], 1, [Use valgrind intrinsics to suppress false warnings])
 fi
 AC_MSG_RESULT([$VALGRIND])
 dnl Restore LDFLAGS and CPPFLAGS
 LDFLAGS="$_SAVE_LDFLAGS"
 CPPFLAGS="$_SAVE_CPPFLAGS"
@@ -2461,6 +2692,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/drivers/rbug/Makefile
 		src/gallium/drivers/softpipe/Makefile
 		src/gallium/drivers/svga/Makefile
 		src/gallium/drivers/swr/Makefile
 		src/gallium/drivers/trace/Makefile
 		src/gallium/drivers/vc4/Makefile
 		src/gallium/drivers/virgl/Makefile
@@ -2512,6 +2744,10 @@ AC_CONFIG_FILES([Makefile
 		src/glx/apple/Makefile
 		src/glx/tests/Makefile
 		src/gtest/Makefile
 		src/intel/Makefile
 		src/intel/genxml/Makefile
 		src/intel/isl/Makefile
 		src/intel/vulkan/Makefile
 		src/loader/Makefile
 		src/mapi/Makefile
 		src/mapi/es1api/glesv1_cm.pc
@@ -2538,6 +2774,14 @@ AC_CONFIG_FILES([Makefile
 AC_OUTPUT
 # Fix up dependencies in *.Plo files, where we changed the extension of a
 # source file
 $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_blorp.Plo
 $SED -i -e 's/gen6_blorp.cpp/gen6_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen6_blorp.Plo
 $SED -i -e 's/gen7_blorp.cpp/gen7_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen7_blorp.Plo
 $SED -i -e 's/gen8_blorp.cpp/gen8_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen8_blorp.Plo
 dnl
 dnl Output some configuration info for the user
 dnl
@@ -2576,12 +2820,15 @@ if test "x$enable_dri" != xno; then
         echo "        DRI driver dir:  $DRI_DRIVER_INSTALL_DIR"
 fi
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xdri)
     echo "        GLX:             DRI-based"
     ;;
 xxlib)
     echo "        GLX:             Xlib-based"
     ;;
 xyesno)
     echo "        GLX:             DRI-based"
 xgallium-xlib)
     echo "        GLX:             Xlib-based (Gallium)"
     ;;
 *)
     echo "        GLX:             $enable_glx"
@@ -2605,6 +2852,15 @@ if test "$enable_egl" = yes; then
     echo "        EGL drivers:    $egl_drivers"
 fi
 # Vulkan
 echo ""
 if test "x$VULKAN_DRIVERS" != x; then
     echo "        Vulkan drivers:  $VULKAN_DRIVERS"
     echo "        Vulkan ICD dir:  $VULKAN_ICD_INSTALL_DIR"
 else
     echo "        Vulkan drivers:  no"
 fi
 echo ""
 if test "x$MESA_LLVM" = x1; then
     echo "        llvm:            yes"
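
Review note: the reworked `--enable-glx` handling in this file treats `yes` as "auto" and resolves it to one of `dri`, `gallium-xlib`, or `xlib`. The sketch below isolates that resolution step as a standalone shell function. The name `resolve_glx` and its positional arguments are hypothetical and exist only for illustration. The sketch covers the selection order and the illegal-value check, and omits the per-implementation dependency checks (OpenGL, DRI, Gallium) that configure.ac also performs.

```shell
#!/bin/sh
# resolve_glx ENABLE_GLX ENABLE_DRI GALLIUM_DRIVERS
# Hypothetical helper: prints the GLX implementation chosen for the inputs.
resolve_glx() {
    enable_glx="$1"
    enable_dri="$2"
    with_gallium_drivers="$3"

    # "yes" means auto: prefer DRI, then Gallium Xlib, then classic Xlib.
    if test "x$enable_glx" = xyes; then
        if test "x$enable_dri" = xyes; then
            enable_glx=dri
        elif test -n "$with_gallium_drivers"; then
            enable_glx=gallium-xlib
        else
            enable_glx=xlib
        fi
    fi

    # Accept only the known implementations (or "no"); reject anything else.
    case "x$enable_glx" in
    xdri | xxlib | xgallium-xlib | xno)
        echo "$enable_glx"
        ;;
    *)
        echo "Illegal value for --enable-glx: $enable_glx" >&2
        return 1
        ;;
    esac
}
```

Called as `resolve_glx yes no swrast`, the function takes the Gallium branch because DRI is off and the driver list is non-empty.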

docs/COPYING

@@ -1,490 +0,0 @@
 Some parts of Mesa are copyrighted under the GNU LGPL.  See the
 Mesa/docs/COPYRIGHT file for details.
 The following is the standard GNU copyright file.
 ----------------------------------------------------------------------
 		  GNU LIBRARY GENERAL PUBLIC LICENSE
 		       Version 2, June 1991
  Copyright (C) 1991 Free Software Foundation, Inc.
 675 Mass Ave, Cambridge, MA 02139, USA
  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.
 [This is the first released version of the library GPL.  It is
  numbered 2 because it goes with version 2 of the ordinary GPL.]
 			    Preamble
   The licenses for most software are designed to take away your
 freedom to share and change it.  By contrast, the GNU General Public
 Licenses are intended to guarantee your freedom to share and change
 free software--to make sure the software is free for all its users.
   This license, the Library General Public License, applies to some
 specially designated Free Software Foundation software, and to any
 other libraries whose authors decide to use it.  You can use it for
 your libraries, too.
   When we speak of free software, we are referring to freedom, not
 price.  Our General Public Licenses are designed to make sure that you
 have the freedom to distribute copies of free software (and charge for
 this service if you wish), that you receive source code or can get it
 if you want it, that you can change the software or use pieces of it
 in new free programs; and that you know you can do these things.
   To protect your rights, we need to make restrictions that forbid
 anyone to deny you these rights or to ask you to surrender the rights.
 These restrictions translate to certain responsibilities for you if
 you distribute copies of the library, or if you modify it.
   For example, if you distribute copies of the library, whether gratis
 or for a fee, you must give the recipients all the rights that we gave
 you.  You must make sure that they, too, receive or can get the source
 code.  If you link a program with the library, you must provide
 complete object files to the recipients so that they can relink them
 with the library, after making changes to the library and recompiling
 it.  And you must show them these terms so they know their rights.
   Our method of protecting your rights has two steps: (1) copyright
 the library, and (2) offer you this license which gives you legal
 permission to copy, distribute and/or modify the library.
   Also, for each distributor's protection, we want to make certain
 that everyone understands that there is no warranty for this free
 library.  If the library is modified by someone else and passed on, we
 want its recipients to know that what they have is not the original
 version, so that any problems introduced by others will not reflect on
 the original authors' reputations.
   Finally, any free program is threatened constantly by software
 patents.  We wish to avoid the danger that companies distributing free
 software will individually obtain patent licenses, thus in effect
 transforming the program into proprietary software.  To prevent this,
 we have made it clear that any patent must be licensed for everyone's
 free use or not licensed at all.
   Most GNU software, including some libraries, is covered by the ordinary
 GNU General Public License, which was designed for utility programs.  This
 license, the GNU Library General Public License, applies to certain
 designated libraries.  This license is quite different from the ordinary
 one; be sure to read it in full, and don't assume that anything in it is
 the same as in the ordinary license.
   The reason we have a separate public license for some libraries is that
 they blur the distinction we usually make between modifying or adding to a
 program and simply using it.  Linking a program with a library, without
 changing the library, is in some sense simply using the library, and is
 analogous to running a utility program or application program.  However, in
 a textual and legal sense, the linked executable is a combined work, a
 derivative of the original library, and the ordinary General Public License
 treats it as such.
   Because of this blurred distinction, using the ordinary General
 Public License for libraries did not effectively promote software
 sharing, because most developers did not use the libraries.  We
 concluded that weaker conditions might promote sharing better.
   However, unrestricted linking of non-free programs would deprive the
 users of those programs of all benefit from the free status of the
 libraries themselves.  This Library General Public License is intended to
 permit developers of non-free programs to use free libraries, while
 preserving your freedom as a user of such programs to change the free
 libraries that are incorporated in them.  (We have not seen how to achieve
 this as regards changes in header files, but we have achieved it as regards
 changes in the actual functions of the Library.)  The hope is that this
 will lead to faster development of free libraries.
   The precise terms and conditions for copying, distribution and
 modification follow.  Pay close attention to the difference between a
 "work based on the library" and a "work that uses the library".  The
 former contains code derived from the library, while the latter only
 works together with the library.
   Note that it is possible for a library to be covered by the ordinary
 General Public License rather than by this special one.
 		  GNU LIBRARY GENERAL PUBLIC LICENSE
    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
   0. This License Agreement applies to any software library which
 contains a notice placed by the copyright holder or other authorized
 party saying it may be distributed under the terms of this Library
 General Public License (also called "this License").  Each licensee is
 addressed as "you".
   A "library" means a collection of software functions and/or data
 prepared so as to be conveniently linked with application programs
 (which use some of those functions and data) to form executables.
   The "Library", below, refers to any such software library or work
 which has been distributed under these terms.  A "work based on the
 Library" means either the Library or any derivative work under
 copyright law: that is to say, a work containing the Library or a
 portion of it, either verbatim or with modifications and/or translated
 straightforwardly into another language.  (Hereinafter, translation is
 included without limitation in the term "modification".)
   "Source code" for a work means the preferred form of the work for
 making modifications to it.  For a library, complete source code means
 all the source code for all modules it contains, plus any associated
 interface definition files, plus the scripts used to control compilation
 and installation of the library.
   Activities other than copying, distribution and modification are not
 covered by this License; they are outside its scope.  The act of
 running a program using the Library is not restricted, and output from
 such a program is covered only if its contents constitute a work based
 on the Library (independent of the use of the Library in a tool for
 writing it).  Whether that is true depends on what the Library does
 and what the program that uses the Library does.
   1. You may copy and distribute verbatim copies of the Library's
 complete source code as you receive it, in any medium, provided that
 you conspicuously and appropriately publish on each copy an
 appropriate copyright notice and disclaimer of warranty; keep intact
 all the notices that refer to this License and to the absence of any
 warranty; and distribute a copy of this License along with the
 Library.
   You may charge a fee for the physical act of transferring a copy,
 and you may at your option offer warranty protection in exchange for a
 fee.
   2. You may modify your copy or copies of the Library or any portion
 of it, thus forming a work based on the Library, and copy and
 distribute such modifications or work under the terms of Section 1
 above, provided that you also meet all of these conditions:
     a) The modified work must itself be a software library.
     b) You must cause the files modified to carry prominent notices
     stating that you changed the files and the date of any change.
     c) You must cause the whole of the work to be licensed at no
     charge to all third parties under the terms of this License.
     d) If a facility in the modified Library refers to a function or a
     table of data to be supplied by an application program that uses
     the facility, other than as an argument passed when the facility
     is invoked, then you must make a good faith effort to ensure that,
     in the event an application does not supply such function or
     table, the facility still operates, and performs whatever part of
     its purpose remains meaningful.
     (For example, a function in a library to compute square roots has
     a purpose that is entirely well-defined independent of the
     application.  Therefore, Subsection 2d requires that any
     application-supplied function or table used by this function must
     be optional: if the application does not supply it, the square
     root function must still compute square roots.)
 These requirements apply to the modified work as a whole.  If
 identifiable sections of that work are not derived from the Library,
 and can be reasonably considered independent and separate works in
 themselves, then this License, and its terms, do not apply to those
 sections when you distribute them as separate works.  But when you
 distribute the same sections as part of a whole which is a work based
 on the Library, the distribution of the whole must be on the terms of
 this License, whose permissions for other licensees extend to the
 entire whole, and thus to each and every part regardless of who wrote
 it.
 Thus, it is not the intent of this section to claim rights or contest
 your rights to work written entirely by you; rather, the intent is to
 exercise the right to control the distribution of derivative or
 collective works based on the Library.
 In addition, mere aggregation of another work not based on the Library
 with the Library (or with a work based on the Library) on a volume of
 a storage or distribution medium does not bring the other work under
 the scope of this License.
   3. You may opt to apply the terms of the ordinary GNU General Public
 License instead of this License to a given copy of the Library.  To do
 this, you must alter all the notices that refer to this License, so
 that they refer to the ordinary GNU General Public License, version 2,
 instead of to this License.  (If a newer version than version 2 of the
 ordinary GNU General Public License has appeared, then you can specify
 that version instead if you wish.)  Do not make any other change in
 these notices.
   Once this change is made in a given copy, it is irreversible for
 that copy, so the ordinary GNU General Public License applies to all
 subsequent copies and derivative works made from that copy.
   This option is useful when you wish to copy part of the code of
 the Library into a program that is not a library.
   4. You may copy and distribute the Library (or a portion or
 derivative of it, under Section 2) in object code or executable form
 under the terms of Sections 1 and 2 above provided that you accompany
 it with the complete corresponding machine-readable source code, which
 must be distributed under the terms of Sections 1 and 2 above on a
 medium customarily used for software interchange.
   If distribution of object code is made by offering access to copy
 from a designated place, then offering equivalent access to copy the
 source code from the same place satisfies the requirement to
 distribute the source code, even though third parties are not
 compelled to copy the source along with the object code.
   5. A program that contains no derivative of any portion of the
 Library, but is designed to work with the Library by being compiled or
 linked with it, is called a "work that uses the Library".  Such a
 work, in isolation, is not a derivative work of the Library, and
 therefore falls outside the scope of this License.
   However, linking a "work that uses the Library" with the Library
 creates an executable that is a derivative of the Library (because it
 contains portions of the Library), rather than a "work that uses the
 library".  The executable is therefore covered by this License.
 Section 6 states terms for distribution of such executables.
   When a "work that uses the Library" uses material from a header file
 that is part of the Library, the object code for the work may be a
 derivative work of the Library even though the source code is not.
 Whether this is true is especially significant if the work can be
 linked without the Library, or if the work is itself a library.  The
 threshold for this to be true is not precisely defined by law.
   If such an object file uses only numerical parameters, data
 structure layouts and accessors, and small macros and small inline
 functions (ten lines or less in length), then the use of the object
 file is unrestricted, regardless of whether it is legally a derivative
 work.  (Executables containing this object code plus portions of the
 Library will still fall under Section 6.)
   Otherwise, if the work is a derivative of the Library, you may
 distribute the object code for the work under the terms of Section 6.
 Any executables containing that work also fall under Section 6,
 whether or not they are linked directly with the Library itself.
   6. As an exception to the Sections above, you may also compile or
 link a "work that uses the Library" with the Library to produce a
 work containing portions of the Library, and distribute that work
 under terms of your choice, provided that the terms permit
 modification of the work for the customer's own use and reverse
 engineering for debugging such modifications.
   You must give prominent notice with each copy of the work that the
 Library is used in it and that the Library and its use are covered by
 this License.  You must supply a copy of this License.  If the work
 during execution displays copyright notices, you must include the
 copyright notice for the Library among them, as well as a reference
 directing the user to the copy of this License.  Also, you must do one
 of these things:
     a) Accompany the work with the complete corresponding
     machine-readable source code for the Library including whatever
     changes were used in the work (which must be distributed under
     Sections 1 and 2 above); and, if the work is an executable linked
     with the Library, with the complete machine-readable "work that
     uses the Library", as object code and/or source code, so that the
     user can modify the Library and then relink to produce a modified
     executable containing the modified Library.  (It is understood
     that the user who changes the contents of definitions files in the
     Library will not necessarily be able to recompile the application
     to use the modified definitions.)
     b) Accompany the work with a written offer, valid for at
     least three years, to give the same user the materials
     specified in Subsection 6a, above, for a charge no more
     than the cost of performing this distribution.
     c) If distribution of the work is made by offering access to copy
     from a designated place, offer equivalent access to copy the above
     specified materials from the same place.
     d) Verify that the user has already received a copy of these
     materials or that you have already sent this user a copy.
   For an executable, the required form of the "work that uses the
 Library" must include any data and utility programs needed for
 reproducing the executable from it.  However, as a special exception,
 the source code distributed need not include anything that is normally
 distributed (in either source or binary form) with the major
 components (compiler, kernel, and so on) of the operating system on
 which the executable runs, unless that component itself accompanies
 the executable.
   It may happen that this requirement contradicts the license
 restrictions of other proprietary libraries that do not normally
 accompany the operating system.  Such a contradiction means you cannot
 use both them and the Library together in an executable that you
 distribute.
   7. You may place library facilities that are a work based on the
 Library side-by-side in a single library together with other library
 facilities not covered by this License, and distribute such a combined
 library, provided that the separate distribution of the work based on
 the Library and of the other library facilities is otherwise
 permitted, and provided that you do these two things:
     a) Accompany the combined library with a copy of the same work
     based on the Library, uncombined with any other library
     facilities.  This must be distributed under the terms of the
     Sections above.
     b) Give prominent notice with the combined library of the fact
     that part of it is a work based on the Library, and explaining
     where to find the accompanying uncombined form of the same work.
   8. You may not copy, modify, sublicense, link with, or distribute
 the Library except as expressly provided under this License.  Any
 attempt otherwise to copy, modify, sublicense, link with, or
 distribute the Library is void, and will automatically terminate your
 rights under this License.  However, parties who have received copies,
 or rights, from you under this License will not have their licenses
 terminated so long as such parties remain in full compliance.
   9. You are not required to accept this License, since you have not
 signed it.  However, nothing else grants you permission to modify or
 distribute the Library or its derivative works.  These actions are
 prohibited by law if you do not accept this License.  Therefore, by
 modifying or distributing the Library (or any work based on the
 Library), you indicate your acceptance of this License to do so, and
 all its terms and conditions for copying, distributing or modifying
 the Library or works based on it.
   10. Each time you redistribute the Library (or any work based on the
 Library), the recipient automatically receives a license from the
 original licensor to copy, distribute, link with or modify the Library
 subject to these terms and conditions.  You may not impose any further
 restrictions on the recipients' exercise of the rights granted herein.
 You are not responsible for enforcing compliance by third parties to
 this License.
   11. If, as a consequence of a court judgment or allegation of patent
 infringement or for any other reason (not limited to patent issues),
 conditions are imposed on you (whether by court order, agreement or
 otherwise) that contradict the conditions of this License, they do not
 excuse you from the conditions of this License.  If you cannot
 distribute so as to satisfy simultaneously your obligations under this
 License and any other pertinent obligations, then as a consequence you
 may not distribute the Library at all.  For example, if a patent
 license would not permit royalty-free redistribution of the Library by
 all those who receive copies directly or indirectly through you, then
 the only way you could satisfy both it and this License would be to
 refrain entirely from distribution of the Library.
 If any portion of this section is held invalid or unenforceable under any
 particular circumstance, the balance of the section is intended to apply,
 and the section as a whole is intended to apply in other circumstances.
 It is not the purpose of this section to induce you to infringe any
 patents or other property right claims or to contest validity of any
 such claims; this section has the sole purpose of protecting the
 integrity of the free software distribution system which is
 implemented by public license practices.  Many people have made
 generous contributions to the wide range of software distributed
 through that system in reliance on consistent application of that
 system; it is up to the author/donor to decide if he or she is willing
 to distribute software through any other system and a licensee cannot
 impose that choice.
 This section is intended to make thoroughly clear what is believed to
 be a consequence of the rest of this License.
   12. If the distribution and/or use of the Library is restricted in
 certain countries either by patents or by copyrighted interfaces, the
 original copyright holder who places the Library under this License may add
 an explicit geographical distribution limitation excluding those countries,
 so that distribution is permitted only in or among countries not thus
 excluded.  In such case, this License incorporates the limitation as if
 written in the body of this License.
   13. The Free Software Foundation may publish revised and/or new
 versions of the Library General Public License from time to time.
 Such new versions will be similar in spirit to the present version,
 but may differ in detail to address new problems or concerns.
 Each version is given a distinguishing version number.  If the Library
 specifies a version number of this License which applies to it and
 "any later version", you have the option of following the terms and
 conditions either of that version or of any later version published by
 the Free Software Foundation.  If the Library does not specify a
 license version number, you may choose any version ever published by
 the Free Software Foundation.
   14. If you wish to incorporate parts of the Library into other free
 programs whose distribution conditions are incompatible with these,
 write to the author to ask for permission.  For software which is
 copyrighted by the Free Software Foundation, write to the Free
 Software Foundation; we sometimes make exceptions for this.  Our
 decision will be guided by the two goals of preserving the free status
 of all derivatives of our free software and of promoting the sharing
 and reuse of software generally.
 			    NO WARRANTY
   15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
 WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
 EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
 OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
 KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
 LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
 THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
   16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
 AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
 LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
 RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
 FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
 SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
 DAMAGES.
 		     END OF TERMS AND CONDITIONS
      Appendix: How to Apply These Terms to Your New Libraries
   If you develop a new library, and you want it to be of the greatest
 possible use to the public, we recommend making it free software that
 everyone can redistribute and change.  You can do so by permitting
 redistribution under these terms (or, alternatively, under the terms of the
 ordinary General Public License).
   To apply these terms, attach the following notices to the library.  It is
 safest to attach them to the start of each source file to most effectively
 convey the exclusion of warranty; and each file should have at least the
 "copyright" line and a pointer to where the full notice is found.
     <one line to give the library's name and a brief idea of what it does.>
     Copyright (C) <year>  <name of author>
     This library is free software; you can redistribute it and/or
     modify it under the terms of the GNU Library General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later version.
     This library is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
     Library General Public License for more details.
     You should have received a copy of the GNU Library General Public
     License along with this library; if not, write to the Free
     Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 Also add information on how to contact you by electronic and paper mail.
 You should also get your employer (if you work as a programmer) or your
 school, if any, to sign a "copyright disclaimer" for the library, if
 necessary.  Here is a sample; alter the names:
   Yoyodyne, Inc., hereby disclaims all copyright interest in the
   library `Frob' (a library for tweaking knobs) written by James Random Hacker.
   <signature of Ty Coon>, 1 April 1990
   Ty Coon, President of Vice
 That's all there is to it!

docs/GL3.txt

@@ -1,13 +1,28 @@
 # Status of OpenGL extensions in Mesa
 Status of OpenGL 3.x features in Mesa
 Here's how to read this file:
 all DONE: <driver>, ...
     All the extensions are done for the given list of drivers.
 Note: when an item is marked as "DONE" it means all the core Mesa
 infrastructure is complete but it may be the case that few (if any) drivers
 implement the features.
 DONE
     The extension is done for Mesa and no implementation is necessary on the
     driver-side.
 DONE ()
     The extension is done for Mesa and all the drivers in the "all DONE" list.
 OpenGL Core and Compatibility context support
 DONE (<driver>, ...)
     The extension is done for Mesa, all the drivers in the "all DONE" list, and
     all the drivers in the brackets.
 in progress
     The extension is started but not finished yet.
 not started
     The extension isn't started yet.
 # OpenGL Core and Compatibility context support
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
@@ -15,249 +30,248 @@ version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 Feature                                               Status
 ----------------------------------------------------- ------------------------
 Feature                                                 Status
 ------------------------------------------------------- ------------------------
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   glBindFragDataLocation, glGetFragDataLocation         DONE
   Conditional rendering (GL_NV_conditional_render)      DONE ()
   Map buffer subranges (GL_ARB_map_buffer_range)        DONE ()
   Clamping controls (GL_ARB_color_buffer_float)         DONE ()
   Float textures, renderbuffers (GL_ARB_texture_float)  DONE ()
   GL_NV_conditional_render (Conditional rendering)      DONE ()
   GL_ARB_map_buffer_range (Map buffer subranges)        DONE ()
   GL_ARB_color_buffer_float (Clamping controls)         DONE ()
   GL_ARB_texture_float (Float textures, renderbuffers)  DONE ()
   GL_EXT_packed_float                                   DONE ()
   GL_EXT_texture_shared_exponent                        DONE ()
   Float depth buffers (GL_ARB_depth_buffer_float)       DONE ()
   Framebuffer objects (GL_ARB_framebuffer_object)       DONE ()
   GL_ARB_depth_buffer_float (Float depth buffers)       DONE ()
   GL_ARB_framebuffer_object (Framebuffer objects)       DONE ()
   GL_ARB_half_float_pixel                               DONE (all drivers)
   GL_ARB_half_float_vertex                              DONE ()
   GL_EXT_texture_integer                                DONE ()
   GL_EXT_texture_array                                  DONE ()
   Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE ()
   GL_EXT_draw_buffers2 (Per-buffer blend and masks)     DONE ()
   GL_EXT_texture_compression_rgtc                       DONE ()
   GL_ARB_texture_rg                                     DONE ()
   Transform feedback (GL_EXT_transform_feedback)        DONE ()
   Vertex array objects (GL_ARB_vertex_array_object)     DONE ()
   sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE ()
   GL_EXT_transform_feedback (Transform feedback)        DONE ()
   GL_ARB_vertex_array_object (Vertex array objects)     DONE ()
   GL_EXT_framebuffer_sRGB (sRGB framebuffer format)     DONE ()
   glClearBuffer commands                                DONE
   glGetStringi command                                  DONE
   glTexParameterI, glGetTexParameterI commands          DONE
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*))
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*), swr (*))
 (*) llvmpipe and softpipe have fake Multisample anti-aliasing support
 (*) llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   Forward compatible context support/deprecations       DONE ()
   Instanced drawing (GL_ARB_draw_instanced)             DONE ()
   Buffer copying (GL_ARB_copy_buffer)                   DONE ()
   Primitive restart (GL_NV_primitive_restart)           DONE ()
   GL_ARB_draw_instanced (Instanced drawing)             DONE ()
   GL_ARB_copy_buffer (Buffer copying)                   DONE ()
   GL_NV_primitive_restart (Primitive restart)           DONE ()
 vertex texture image units                         DONE ()
   Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts ()
   Rectangular textures (GL_ARB_texture_rectangle)       DONE ()
   Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE ()
   Signed normalized textures (GL_EXT_texture_snorm)     DONE ()
   GL_ARB_texture_buffer_object (Texture buffer objs)    DONE (for OpenGL 3.1 contexts)
   GL_ARB_texture_rectangle (Rectangular textures)       DONE ()
   GL_ARB_uniform_buffer_object (Uniform buffer objs)    DONE ()
   GL_EXT_texture_snorm (Signed normalized textures)     DONE ()
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
   BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE ()
   Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE ()
   Frag shader coord (GL_ARB_fragment_coord_conventions) DONE ()
   Provoking vertex (GL_ARB_provoking_vertex)            DONE ()
   Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE ()
   Multisample textures (GL_ARB_texture_multisample)     DONE ()
   Frag depth clamp (GL_ARB_depth_clamp)                 DONE ()
   Fence objects (GL_ARB_sync)                           DONE ()
   GL_ARB_vertex_array_bgra (BGRA vertex order)          DONE (swr)
   GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (swr)
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (swr)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (swr)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (swr)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (swr)
   GL_ARB_sync (Fence objects)                           DONE (swr)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
   GL_ARB_blend_func_extended                            DONE ()
   GL_ARB_blend_func_extended                            DONE (swr)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
   GL_ARB_occlusion_query2                               DONE ()
   GL_ARB_occlusion_query2                               DONE (swr)
   GL_ARB_sampler_objects                                DONE (all drivers)
   GL_ARB_shader_bit_encoding                            DONE ()
   GL_ARB_texture_rgb10_a2ui                             DONE ()
   GL_ARB_texture_swizzle                                DONE ()
   GL_ARB_timer_query                                    DONE ()
   GL_ARB_instanced_arrays                               DONE ()
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE ()
   GL_ARB_shader_bit_encoding                            DONE (swr)
   GL_ARB_texture_rgb10_a2ui                             DONE (swr)
   GL_ARB_texture_swizzle                                DONE (swr)
   GL_ARB_timer_query                                    DONE (swr)
   GL_ARB_instanced_arrays                               DONE (swr)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (swr)
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965, llvmpipe, softpipe)
   GL_ARB_gpu_shader5                                   DONE (i965)
   - 'precise' qualifier                                DONE
   - Dynamically uniform sampler array indices          DONE (softpipe)
   - Dynamically uniform UBO array indices              DONE ()
   - Implicit signed -> unsigned conversions            DONE
   - Fused multiply-add                                 DONE ()
   - Packing/bitfield/conversion functions              DONE (softpipe)
   - Enhanced textureGather                             DONE (softpipe)
   - Geometry shader instancing                         DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                   DONE ()
   - Enhanced per-sample shading                        DONE ()
   - Interpolation functions                            DONE ()
   - New overload resolution rules                      DONE
   GL_ARB_gpu_shader_fp64                               DONE (llvmpipe, softpipe)
   GL_ARB_sample_shading                                DONE (i965, nv50)
   GL_ARB_shader_subroutine                             DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_tessellation_shader                           DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_query_lod                             DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_draw_buffers_blend                             DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
   - Implicit signed -> unsigned conversions             DONE
   - Fused multiply-add                                  DONE ()
   - Packing/bitfield/conversion functions               DONE (softpipe)
   - Enhanced textureGather                              DONE (softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE ()
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965, nv50)
   GL_ARB_shader_subroutine                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965, nv50, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_get_program_binary                            DONE (0 binary formats)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_shader_precision                              DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                           DONE (llvmpipe, softpipe)
   GL_ARB_viewport_array                                DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_ES2_compatibility                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20:
 GL 4.2, GLSL 4.20 -- all DONE: radeonsi
   GL_ARB_texture_compression_bptc                      DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage              DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965, nvc0)
   GL_ARB_texture_storage                               DONE (all drivers)
   GL_ARB_transform_feedback_instanced                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_shader_image_load_store                       DONE (i965)
   GL_ARB_conservative_depth                            DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                      DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_internalformat_query                          DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_map_buffer_alignment                          DONE (all drivers)
   GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30:
   GL_ARB_arrays_of_arrays                              DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                             DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                           DONE (all drivers)
   GL_ARB_compute_shader                                DONE (i965)
   GL_ARB_copy_image                                    DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_debug                                         DONE (all drivers)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_internalformat_query2                         in progress (elima)
   GL_ARB_invalidate_subdata                            DONE (all drivers)
   GL_ARB_multi_draw_indirect                           DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                 not started
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  DONE (i965, nvc0)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                          DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE                          DONE (all drivers)
   GL_ARB_buffer_storage                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                 DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                              in progress (Timothy)
   - compile-time constant expressions                  DONE
   - explicit byte offsets for blocks                   in progress
   - forced alignment within blocks                     in progress
   - specified vec4-slot component numbers              in progress
   - specified transform/feedback layout                in progress
   - input/output block locations                       DONE
   GL_ARB_multi_bind                                    DONE (all drivers)
   GL_ARB_query_buffer_object                           DONE (nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_stencil8                              DONE (nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                               in progress (Timothy)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
   - specified vec4-slot component numbers               in progress
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
 GL 4.5, GLSL 4.50:
   GL_ARB_ES3_1_compatibility                           not started
   GL_ARB_clip_control                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_conditional_render_inverted                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_cull_distance                                 in progress (Tobias)
   GL_ARB_derivative_control                            DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_access                           DONE (all drivers)
   GL_ARB_get_texture_sub_image                         DONE (all drivers)
   GL_ARB_shader_texture_image_samples                  DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                               DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                         DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robust_buffer_access_behavior                 not started
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
   GL_ARB_ES3_1_compatibility                            DONE (nvc0, radeonsi)
   GL_ARB_clip_control                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_derivative_control                             DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
   GL_ARB_arrays_of_arrays                              DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                DONE (i965)
   GL_ARB_draw_indirect                                 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965, nvc0)
   GL_ARB_shader_image_load_store                       DONE (i965)
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  DONE (i965, nvc0)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   Multisample textures (GL_ARB_texture_multisample)    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GS5 Enhanced textureGather                           DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions            DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions             DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
       glMemoryBarrierByRegion                          DONE
       glGetTexLevelParameter[fi]v - needs updates      DONE
       glMemoryBarrierByRegion                           DONE
       glGetTexLevelParameter[fi]v - needs updates       DONE
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                      DONE (i965, nvc0, r600)
       gl_HelperInvocation support                       DONE (i965, nvc0, r600, radeonsi)
 GLES3.2, GLSL ES 3.2
   GL_EXT_color_buffer_float                            DONE (all drivers)
   GL_KHR_blend_equation_advanced                       not started
   GL_KHR_debug                                         DONE (all drivers)
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_KHR_texture_compression_astc_ldr                  DONE (i965/gen9+)
   GL_OES_copy_image                                    not started (based on GL_ARB_copy_image, which is done for some drivers)
   GL_OES_draw_buffers_indexed                          not started
   GL_OES_draw_elements_base_vertex                     DONE (all drivers)
   GL_OES_geometry_shader                               started (Marta)
   GL_OES_gpu_shader5                                   not started (based on parts of GL_ARB_gpu_shader5, which is done for some drivers)
   GL_OES_primitive_bounding box                        not started
   GL_OES_sample_shading                                not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_sample_variables                              not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_shader_image_atomic                           not started (based on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
   GL_OES_shader_io_blocks                              not started (based on parts of GLSL 1.50, which is done)
   GL_OES_shader_multisample_interpolation              not started (based on parts of GL_ARB_gpu_shader5, which is done)
   GL_OES_tessellation_shader                           not started (based on GL_ARB_tessellation_shader, which is done for some drivers)
   GL_OES_texture_border_clamp                          not started (based on GL_ARB_texture_border_clamp, which is done)
   GL_OES_texture_buffer                                not started (based on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and GL_ARB_texture_buffer_object_rgb32 that are all done)
   GL_OES_texture_cube_map_array                        not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8                              DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array          DONE (all drivers that support GL_ARB_texture_multisample)
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        not started
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
   GL_OES_copy_image                                     DONE (i965)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                started (idr)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         not started
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_tessellation_shader                            started (Ken)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality

									
										4

docs/download.html
									
				@@ -18,7 +18,9 @@

				<p>

				Primary Mesa download site:

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">freedesktop.org</a> (FTP)

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)

				or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>

				(HTTP).

				</p>

				<p>

									
										8

docs/egl.html
									
				@@ -89,9 +89,11 @@ types such as <code>EGLNativeDisplayType</code> or

				<p>The available platforms are <code>x11</code>, <code>drm</code>,

				<code>wayland</code>, <code>surfaceless</code>, <code>android</code>,

				and <code>haiku</code>.  The <code>android</code> platform

				can only be built as a system component, part of AOSP, while the

				<code>haiku</code> platform can only be built with SCons.

				and <code>haiku</code>.

				The <code>android</code> platform can either be built as a system

				component, part of AOSP, using <code>Android.mk</code> files, or

				cross-compiled using appropriate <code>configure</code> options.

				The <code>haiku</code> platform can only be built with SCons.

				Unless for special needs, the build system should

				select the right platforms automatically.</p>

									
										4

docs/envvars.html
									
				@@ -163,6 +163,10 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				   <li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>

				   <li>vec4 - force vec4 mode in vertex shader</li>

				   <li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>

				   <li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>

				   <li>norbc - disable single sampled render buffer compression</li>

				</ul>

				</ul>

									
										27

docs/index.html
									
				@@ -16,6 +16,33 @@

				<h1>News</h1>

				<h2>May 9, 2016</h2>

				<p>

				<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and

				<a href="relnotes/11.2.2.html">Mesa 11.2.2</a> are released.

				These are bug-fix releases from the 11.1 and 11.2 branches, respectively.

				<br>

				NOTE: It is anticipated that 11.1.4 will be the final release in the 11.1.4

				series. Users of 11.1 are encouraged to migrate to the 11.2 series in order

				to obtain future fixes.

				</p>

				<h2>April 17, 2016</h2>

				<p>

				<a href="relnotes/11.1.3.html">Mesa 11.1.3</a> and

				<a href="relnotes/11.2.1.html">Mesa 11.2.1</a> are released.

				These are bug-fix releases from the 11.1 and 11.2 branches, respectively.

				</p>

				<h2>April 4, 2016</h2>

				<p>

				<a href="relnotes/11.2.0.html">Mesa 11.2.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>February 10, 2016</h2>

				<p>

				<a href="relnotes/11.1.2.html">Mesa 11.1.2</a> is released.

									
										3

docs/install.html
									
				@@ -73,8 +73,7 @@ The following are required for DRI-based hardware acceleration with Mesa:

				<ul>

				<li><a href="http://xorg.freedesktop.org/releases/individual/proto/">

				dri2proto</a> version 2.6 or later

				<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a>

				version 2.4.33 or later

				<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a> latest version

				<li>Xorg server version 1.5 or later

				<li>Linux 2.6.28 or later

				</ul>

									
										14

docs/license.html
									
				@@ -46,10 +46,10 @@ library</em>. <br>

				<p>

				The Mesa distribution consists of several components.  Different copyrights

				and licenses apply to different components.  For example, some demo programs

				are copyrighted by SGI, some of the Mesa device drivers are copyrighted by

				their authors.  See below for a list of Mesa's main components and the license

				for each.

				and licenses apply to different components.

				For example, the GLX client code uses the SGI Free Software License B, and

				some of the Mesa device drivers are copyrighted by their authors.

				See below for a list of Mesa's main components and the license for each.

				</p>

				<p>

				The core Mesa library is licensed according to the terms of the MIT license.

				@@ -97,13 +97,17 @@ and their respective licenses.

				<pre>

				Component         Location               License

				------------------------------------------------------------------

				Main Mesa code    src/mesa/              Mesa (MIT)

				Main Mesa code    src/mesa/              MIT

				Device drivers    src/mesa/drivers/*     MIT, generally

				Gallium code      src/gallium/           MIT

				Ext headers       include/GL/glext.h     Khronos

				                  include/GL/glxext.h

				GLX client code   src/glx/               SGI Free Software License B

				C11 thread        include/c11/threads*.h Boost (permissive)

				emulation

				</pre>

									
										5

docs/relnotes.html
									
				@@ -21,6 +21,11 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>

				<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>

				<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>

				<li><a href="relnotes/11.1.3.html">11.1.3 release notes</a>

				<li><a href="relnotes/11.2.0.html">11.2.0 release notes</a>

				<li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>

				<li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>

				<li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>

									
										319

docs/relnotes/11.1.3.html
									
										Normal file
									
				@@ -0,0 +1,319 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.3 Release Notes / April 17, 2016</h1>

				<p>

				Mesa 11.1.3 is a bug fix release which fixes bugs found since the 11.1.2 release.

				</p>

				<p>

				Mesa 11.1.3 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				9e86c72b6b2e8adb53c1c4a0002ab267b45094d753eb9404b1db34f81ce94ccf  mesa-11.1.3.tar.gz

				51f6658a214d75e4d9f05207586d7ed56ebba75c6b10841176fb6675efa310ac  mesa-11.1.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94195">Bug 94195</a> - [llvmpipe] Does not build with LLVM 3.7.x on Windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94954">Bug 94954</a> - test_vec4_copy_propagation fails in `make check`</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Fix assert conditions for src/dst x/y offsets</li>

				</ul>

				<p>Ben Widawsky (2):</p>

				<ul>

				  <li>i965: Make sure we blit a full compressed block</li>

				  <li>i965/skl: Add two missing device IDs</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965/blorp: Fix hiz ops on MSAA surfaces</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>radeon/uvd: disable MPEG1</li>

				</ul>

				<p>Christian Schmidbauer (1):</p>

				<ul>

				  <li>st/nine: specify WINAPI only for i386 and amd64</li>

				</ul>

				<p>Daniel Czarnowski (3):</p>

				<ul>

				  <li>egl_dri2: NULL check for xcb_dri2_get_buffers_reply()</li>

				  <li>egl_dri2: set correct error code if swapbuffers fails</li>

				  <li>egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...)</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>mesa/fbobject: propogate Layered when reusing attachments.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage</li>

				</ul>

				<p>Dongwon Kim (1):</p>

				<ul>

				  <li>egl: move Null check to eglGetSyncAttribKHR to prevent Segfault</li>

				</ul>

				<p>Emil Velikov (10):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.2</li>

				  <li>get-pick-list.sh: Require explicit "11.1" for nominating stable patches</li>

				  <li>cherry-ignore: do not pick nv50/ir commit</li>

				  <li>automake: add nine to make distcheck</li>

				  <li>install-gallium-links: port changes from install-lib-links</li>

				  <li>automake: add more missing options for make distcheck</li>

				  <li>mesa; add get-extra-pick-list.sh script into bin/</li>

				  <li>egl/x11: check the return value of xcb_dri2_get_buffers_reply()</li>

				  <li>nvc/ir: remove duplicate variable declaration</li>

				  <li>Update version to 11.1.3</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>i965: Reupload push and pull constants when we get new shader image unit state.</li>

				  <li>i965/fs: Add missing analysis invalidation in opt_sampler_eot().</li>

				  <li>i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().</li>

				  <li>i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.</li>

				</ul>

				<p>Ilia Mirkin (21):</p>

				<ul>

				  <li>nvc0/ir: fix converting between predicate and gpr</li>

				  <li>nvc0: add some missing PUSH_SPACE's</li>

				  <li>nvc0: avoid negatives in PUSH_SPACE argument</li>

				  <li>glsl: make sure builtins are initialized before getting the shader</li>

				  <li>glsl: return cloned signature, not the builtin one</li>

				  <li>nv50/ir: fix quadop emission in the presence of predication</li>

				  <li>st/mesa: fix up result_src.type when doing i2u/u2i conversions</li>

				  <li>meta/copy_image: use precomputed dst_internal_format to avoid segfault</li>

				  <li>st/mesa: force depth mode to GL_RED for sized depth/stencil formats</li>

				  <li>glx: update to updated version of EXT_create_context_es2_profile</li>

				  <li>nv50,nvc0: bump minimum texture buffer offset alignment</li>

				  <li>nvc0: reset TFB bufctx when we no longer hold a reference to the buffers</li>

				  <li>glsl: avoid stack smashing when there are too many attributes</li>

				  <li>nvc0: fix blit triangle size to fully cover FB's &gt; 8192x8192</li>

				  <li>nv50: reset TFB bufctx when we no longer hold a reference to the buffers</li>

				  <li>nv50/ir: force-enable derivatives on TXD ops</li>

				  <li>st/mesa: only minify depth for 3d targets</li>

				  <li>nv50/ir: fix indirect texturing for non-array textures on nvc0</li>

				  <li>nvc0/ir: fix picking of coordinates from tex instruction for textureGrad</li>

				  <li>nvc0: disable primitive restart and index bias during blits</li>

				  <li>nv50/ir: we can't load local memory directly into an output</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>nir/lower_vec_to_movs: Better report channels handled by insert_mov</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>mesa: Make glGet queries initialize ctx-&gt;Debug when necessary.</li>

				  <li>mesa: Allow Get*() of several forgotten IsEnabled() pnames.</li>

				  <li>i965: Only magnify depth for 3D textures, not array textures.</li>

				</ul>

				<p>Koop Mast (1):</p>

				<ul>

				  <li>st/clover: Add libelf cflags to the build</li>

				</ul>

				<p>Marc-André Lureau (1):</p>

				<ul>

				  <li>virtio_gpu: Add virtio 1.0 PCI ID to driver map</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: fix Hyper-Z on Stoney</li>

				  <li>gallium/radeon: don't use temporary buffers for persistent mappings</li>

				  <li>radeonsi: fix Hyper-Z hangs on P2 configs</li>

				</ul>

				<p>Matt Turner (3):</p>

				<ul>

				  <li>i965/vec4: don't copy ATTR into 3src instructions with complex swizzles</li>

				  <li>i965/fs: Don't CSE negated multiplies with saturation.</li>

				  <li>i965/vec4: Update vec4 unit tests for commit 01dacc83ff.</li>

				</ul>

				<p>Nanley Chery (2):</p>

				<ul>

				  <li>mesa/image: Make _mesa_clip_readpixels() work with renderbuffers</li>

				  <li>mesa/readpix: Clip ReadPixels() area to the ReadBuffer's</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>r600g: clear compressed_depthtex/colortex_mask when binding buffer texture</li>

				  <li>st/mesa: use the texture view's format for render-to-texture</li>

				</ul>

				<p>Nishanth Peethambaran (2):</p>

				<ul>

				  <li>st/omx: Remove trailing spaces</li>

				  <li>st/omx/dec: Correct the timestamping</li>

				</ul>

				<p>Oded Gabbay (8):</p>

				<ul>

				  <li>gallium/radeon: Correctly translate colorswaps for big endian</li>

				  <li>llvmpipe: use vpkswss when dst is signed</li>

				  <li>gallium/radeon: return correct values for BE in r600_translate_colorswap</li>

				  <li>gallium/radeon: remove separate BE path in r600_translate_colorswap</li>

				  <li>gallium/r600: Don't let h/w do endian swap for colorformat</li>

				  <li>gallium/radeon: disable evergreen_do_fast_color_clear for BE</li>

				  <li>r600g: Do colorformat endian swap for PIPE_USAGE_STAGING</li>

				  <li>radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING</li>

				</ul>

				<p>Olivier Pena (1):</p>

				<ul>

				  <li>scons: support for LLVM 3.7.</li>

				</ul>

				<p>Patrick Baggett (1):</p>

				<ul>

				  <li>mesa: Use SSE prefetch instructions rather than 3DNow instructions</li>

				</ul>

				<p>Rob Herring (10):</p>

				<ul>

				  <li>Android: remove dependence on .SECONDEXPANSION</li>

				  <li>Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system</li>

				  <li>Android: add -Wno-date-time flag for clang</li>

				  <li>Android: remove headers from LOCAL_SRC_FILES</li>

				  <li>Android: clean-up and fix DRI module path handling</li>

				  <li>freedreno: drop unnecessary -Wno-packed-bitfield-compat</li>

				  <li>gallium/radeon: Add space between string literal and identifier</li>

				  <li>r600: Make enum alu_op_flags unsigned</li>

				  <li>virtio_gpu: Add PCI ID to driver map</li>

				  <li>Android: fix x86 gallium builds</li>

				</ul>

				<p>Roland Scheidegger (2):</p>

				<ul>

				  <li>softpipe: fix anisotropic filtering crash</li>

				  <li>draw: fix line stippling</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>nvc0: make sure to delete samplers used by compute shaders</li>

				</ul>

				<p>Steinar H. Gunderson (1):</p>

				<ul>

				  <li>mesa: Fix locking of GLsync objects.</li>

				</ul>

				<p>Tamil velan (1):</p>

				<ul>

				  <li>radeon/uvd: increase max height to 4096 for VI and newer</li>

				</ul>

				<p>Thomas Hellstrom (2):</p>

				<ul>

				  <li>winsys/svga: Fix an uninitialized return value</li>

				  <li>winsys/svga: Increase the fence timeout</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>llvmpipe: Do not use barriers if not using threads.</li>

				</ul>

				<p>xavier (1):</p>

				<ul>

				  <li>r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.</li>

				</ul>

				</div>

				</body>

				</html>

									
										182

docs/relnotes/11.1.4.html
									
										Normal file
									
												
				@@ -0,0 +1,182 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.4 Release Notes / May 9, 2016</h1>

				<p>

				Mesa 11.1.4 is a bug fix release which fixes bugs found since the 11.1.3 release.

				</p>

				<p>

				Mesa 11.1.4 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				034231fffb22621dadb8e4a968cb44752b8b68db7a2417568d63c275b3490cea  mesa-11.1.4.tar.gz

				0f781e9072655305f576efd4204d183bf99ac8cb8d9e0dd9fc2b4093230a0eba  mesa-11.1.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>gallium/util: initialize pipe_framebuffer_state to zeros</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>dri: Fix robust context creation via EGL attribute</li>

				</ul>

				<p>Egbert Eich (1):</p>

				<ul>

				  <li>dri2: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.3</li>

				  <li>cherry-ignore: add non-applicable "fix of a fix"</li>

				  <li>cherry-ignore: ignore st_DrawAtlasBitmaps mem leak fix</li>

				  <li>cherry-ignore: add CodeEmitterGK110::emitATOM() fix</li>

				  <li>Update version to 11.1.4</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Fix subimage accesses to LT textures.</li>

				  <li>vc4: Add support for rendering to cube map surfaces.</li>

				  <li>vc4: Fix tests for format supported with nr_samples == 1.</li>

				  <li>vc4: Make sure we recompile when sample_mask changes.</li>

				</ul>

				<p>Frederic Devernay (1):</p>

				<ul>

				  <li>glapi: fix _glapi_get_proc_address() for mangled function names</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>

				  <li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>egl/x11: authenticate before doing chipset id ioctls</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/uvd: fix tonga feedback buffer size</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>drirc: add a workaround for blackness in Warsow</li>

				  <li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>

				</ul>

				<p>Nicolai Hähnle (5):</p>

				<ul>

				  <li>radeonsi: fix bounds check in si_create_vertex_elements</li>

				  <li>gallium/radeon: handle failure when mapping staging buffer</li>

				  <li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>

				  <li>gallium/radeon: fix crash in r600_set_streamout_targets</li>

				  <li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>

				</ul>

				<p>Oded Gabbay (4):</p>

				<ul>

				  <li>r600g/radeonsi: send endian info to format translation functions</li>

				  <li>r600g: set endianess of 16/32-bit buffers according to do_endian_swap</li>

				  <li>r600g: use do_endian_swap in color swapping functions</li>

				  <li>r600g: use do_endian_swap in texture swapping function</li>

				</ul>

				<p>Roland Scheidegger (3):</p>

				<ul>

				  <li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>

				  <li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>

				  <li>gallivm: make sampling more robust against bogus coordinates</li>

				</ul>

				<p>Samuel Pitoiset (5):</p>

				<ul>

				  <li>gk110/ir: make use of IMUL32I for all immediates</li>

				  <li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>

				  <li>gk110/ir: add emission for (a OP b) OP c</li>

				  <li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>

				  <li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>dri3: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Thomas Hindoe Paaboel Andersen (1):</p>

				<ul>

				  <li>st/va: avoid dereference after free in vlVaDestroyImage</li>

				</ul>

				<p>WuZhen (3):</p>

				<ul>

				  <li>tgsi: initialize stack allocated struct</li>

				  <li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>

				  <li>android: enable dlopen() on all architectures</li>

				</ul>

				</div>

				</body>

				</html>

									
										219

docs/relnotes/11.2.0.html
									
												
				@@ -14,7 +14,7 @@

				<iframe src="../contents.html"></iframe>

				<div class="content">

				-<h1>Mesa 11.2.0 Release Notes / TBD</h1>

				+<h1>Mesa 11.2.0 Release Notes / 4 April 2016</h1>

				<p>

				Mesa 11.2.0 is a new development release.

				@@ -33,7 +33,8 @@ because compatibility contexts are not supported.

				<h2>SHA256 checksums</h2>

				<pre>

				-TBD.

				+dea3d8143929aad5c24ef0993ddb05807b30c284b488fc62903adfcc1c127887  mesa-11.2.0.tar.gz

				+1c1fed2674abf3f16ed2623e9a5694d6752c293194e18462ebc644a19cfaafb2  mesa-11.2.0.tar.xz

				</pre>

				@@ -70,7 +71,217 @@ Note: some of the new features are only available with certain drivers.

				<h2>Bug fixes</h2>

				TBD.

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75165">Bug 75165</a> - compute.c:464:49: error: function definition is not allowed here</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79783">Bug 79783</a> - Distorted output in obs-studio where other vendors &quot;work&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89330">Bug 89330</a> - piglit glsl-1.50 invariant-qualifier-in-out-block-01 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89969">Bug 89969</a> - nouveau: add support for chunk decoding in order to support vaapi (st/va)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90348">Bug 90348</a> - Spilling failure of b96 merged value</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91806">Bug 91806</a> - configure does not test whether assembler supports sse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91927">Bug 91927</a> - [SKL] [regression] piglit compressed textures tests fail  with kernel upgrade</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92229">Bug 92229</a> - [APITRACE] SOMA have serious graphical errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92233">Bug 92233</a> - Unigine Heaven 4.0 silhuette run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92438">Bug 92438</a> - Segfault in pushbuf_kref when running the android emulator (qemu) on nv50</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92589">Bug 92589</a> - [BDW BSW SKL CTS] ES31-CTS.texture_gather.* GPU_HANG</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92595">Bug 92595</a> - [HSW,BDW,SKL][GLES 3.1 CTS] Big difference in the results for the ES31-CTS.shader_bitfield_operation.* tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92609">Bug 92609</a> - [BDW, BSW] piglit sampling-2d-array-as-2d-layer fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92687">Bug 92687</a> - Add support for ARB_internalformat_query2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92706">Bug 92706</a> - glBlitFramebuffer refuses to blit RGBA to RGB with MSAA</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92709">Bug 92709</a> - &quot;LLVM triggered Diagnostic Handler: unsupported call to function ldexpf in main&quot; when starting race in stuntrally</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92759">Bug 92759</a> - [Regression, bisected] Visuals without alpha bits are not sRGB-capable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92849">Bug 92849</a> - [IVB HSW BDW] piglit image load/store load-from-cleared-image.shader_test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92909">Bug 92909</a> - Offset/alignment issue with layout std140 and vec3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93004">Bug 93004</a> - Guild Wars 2 crash on nouveau DX11 cards</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93048">Bug 93048</a> - [CTS regression] mesa af2723 breaks GL Conformance for debug extension</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93063">Bug 93063</a> - drm_helper.h:227:1: error: static declaration of ‘pipe_virgl_create_screen’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93091">Bug 93091</a> - [opencl] segfault when running any opencl programs (like clinfo)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93092">Bug 93092</a> - lp_test_format regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93126">Bug 93126</a> - wrongly claim supporting GL_EXT_texture_rg</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93180">Bug 93180</a> - [regression] arb_separate_shader_objects.active sampler conflict fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93189">Bug 93189</a> - &quot;./util/u_inlines.h&quot;, line 83: operands have incompatible types: void &quot;:&quot; int</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93215">Bug 93215</a> - [Regression bisected] Ogles1conform Automatic mipmap generation test is fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93235">Bug 93235</a> - [regression] dispatch sanity broken by GetPointerv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93257">Bug 93257</a> - [SKL, bisected] ASTC dEQP tests segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93264">Bug 93264</a> - Tonga VM Faults since llvm ScheduleDAGInstrs: Rework schedule graph builder.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93266">Bug 93266</a> - gl_arb_shading_language_420pack does not allow binding of image variables</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93300">Bug 93300</a> - Two Worlds 2 renders water incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93312">Bug 93312</a> - [SKL][GLES 3.1 CTS] ES31-CTS.layout_binding* GPU_HANG</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93320">Bug 93320</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93322">Bug 93322</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.resource-ubo fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93323">Bug 93323</a> - [HSW,BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.basic-allTargets-store-fs fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93325">Bug 93325</a> - [HSW,BDW,SKL]ES31-CTS.explicit_uniform_location.uniform-loc-* 2 tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93339">Bug 93339</a> - glLinkProgram() should fail when a varying is never written to in a previous stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93348">Bug 93348</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.* segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93387">Bug 93387</a> - inverse() shouldn’t be exposed in GLSL 1.20 and 1.30</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93388">Bug 93388</a> - [i965, regression, bisection] MESA_FORMAT_B8G8R8X8_SRGB changes break kwin</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93407">Bug 93407</a> - [SKL][GLES 3.1 CTS]ES31-CTS.compute_shader.resources-texture fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93410">Bug 93410</a> - [BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.negative-linkErrors fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93426">Bug 93426</a> - [SKL,BDW,BSW,BXT] CTS regression: es2-cts.gtf.gl2fixedtests.buffer_objects.buffer_object,s</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93526">Bug 93526</a> - GfxBench 4 tessellation demos misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93532">Bug 93532</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.*. Regression, bisected.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93540">Bug 93540</a> - [BISECTED, HSW] Rendering issue in Heaven (and other benchmarks)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93560">Bug 93560</a> - opt_combine_constants failing fabsf(reg-&gt;f) == table.imm[i].val assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93599">Bug 93599</a> - Strange green flashes with &quot;Metro: Last Light Redux&quot; + &quot;Metro 2033 Redux&quot; with Intel Mesa driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93648">Bug 93648</a> - Random lines being rendered when playing Dolphin (geometry shaders related, w/ apitrace)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93650">Bug 93650</a> - GL_ARB_separate_shader_objects is buggy (PCSX2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93696">Bug 93696</a> - [HSW,BDW;SKL][GLES 3.1 CTS]ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-* fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93700">Bug 93700</a> - [SKL, regression] deqp-gles2.functional.texture.completeness</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93717">Bug 93717</a> - Meta mipmap generation can corrupt texture state</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93722">Bug 93722</a> - Segfault when compiling shader with a subroutine that takes a parameter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93725">Bug 93725</a> - [HSW, regression, bisected] ES31-CTS.texture_gather.*depth*</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93731">Bug 93731</a> - glUniformSubroutinesuiv segfaults when subroutine uniform is bound to a specific location</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93761">Bug 93761</a> - A conditional discard in a fragment shader causes no depth writing at all</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93790">Bug 93790</a> - [HSW] Use after free with compute programs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93792">Bug 93792</a> - [HSW] intel_mipmap_tree.c:1325: intel_miptree_copy_slice: Assertion `src_mt-&gt;format == dst_mt-&gt;format</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93862">Bug 93862</a> - [Bisected] &quot;drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2&quot; is bad</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93878">Bug 93878</a> - [llvmpipe][softpipe] piglit arb_gpu_shader_fp64-double-gettransformfeedbackvarying regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93957">Bug 93957</a> - [HSW] Mishandling of sample count when using an attachment-less framebuffer (assertion error)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93961">Bug 93961</a> - virgl build failure after 2016-02-01 changes - no previous prototype for 'virgl_drm_winsys_create'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93989">Bug 93989</a> - build: flex-2.5.39 seems to be failing for glsl_lexer.ll</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94016">Bug 94016</a> - make check MesaExtensionsTest.AlphabeticallySorted regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94019">Bug 94019</a> - [bisected] 3D acceleration broken with gallium/radeon: just get num_tile_pipes from the winsys</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94091">Bug 94091</a> - Tonga unreal elemental segfault since radeonsi: put image, fmask, and sampler descriptors into one array</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94100">Bug 94100</a> - [HSW] compute indirect dispatch with 0 work groups causes gpu hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94134">Bug 94134</a> - [regression] piglit.spec.arb_texture_view.sampling-2d-array-as-2d-layer assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94139">Bug 94139</a> - [regression, HSW, IVB] piglit.spec.arb_compute_shader.minmax</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94150">Bug 94150</a> - UE4 Suntemple rendering errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94186">Bug 94186</a> - Crash when launching glxinfo and World of Warcraft with RV790</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94188">Bug 94188</a> - define (or undef) defined behaves stupidly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>

				</ul>

				<h2>Changes</h2>

				@@ -78,7 +289,7 @@ Microsoft Visual Studio 2013 or later is now required for building

				on Windows.

				Previously, Visual Studio 2008 and later were supported.

				TBD.

				</div>

				</body>

									
										119

docs/relnotes/11.2.1.html
									
										Normal file
									
												
				@@ -0,0 +1,119 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.1 Release Notes / April 17, 2016</h1>

				<p>

				Mesa 11.2.1 is a bug fix release which fixes bugs found since the 11.2.0 release.

				</p>

				<p>

				Mesa 11.2.1 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				cc2a024204564a71acc95cf262bf618fe49b1d77d351e5755eea705cadac5167  mesa-11.2.1.tar.gz

				a65207e9ae5c5f1c29f863c6a2cc98a7ab99762a24b82a248337f0ea9cfce01b  mesa-11.2.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>st/mesa: fix glReadBuffer() assertion failure</li>

				  <li>st/mesa: fix memleak in glDrawPixels cache code</li>

				</ul>

				<p>Christian Schmidbauer (1):</p>

				<ul>

				  <li>st/nine: specify WINAPI only for i386 and amd64</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.2.0</li>

				  <li>configure.ac: update the path of the generated files</li>

				  <li>Update version to 11.2.1</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310</li>

				</ul>

				<p>Iurie Salomov (1):</p>

				<ul>

				  <li>va: check null context in vlVaDestroyContext</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>

				  <li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6.</li>

				  <li>i965: Use brw-&gt;urb.min_vs_urb_entries instead of 32 for BLORP.</li>

				  <li>glsl: Lower variable indexing of system value arrays unconditionally.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>drirc: add a workaround for blackness in Warsow</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>radeonsi: fix bounds check in si_create_vertex_elements</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>nv50/ir: do not try to attach JOIN ops to ATOM</li>

				</ul>

				<p>Thomas Hindoe Paaboel Andersen (1):</p>

				<ul>

				  <li>st/va: avoid dereference after free in vlVaDestroyImage</li>

				</ul>

				</div>

				</body>

				</html>

									
										210

docs/relnotes/11.2.2.html
									
										Normal file
									
												
				@@ -0,0 +1,210 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.2 Release Notes / May 9, 2016</h1>

				<p>

				Mesa 11.2.2 is a bug fix release which fixes bugs found since the 11.2.1 release.

				</p>

				<p>

				Mesa 11.2.2 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e2453014cd2cc5337a5180cdeffe8cf24fffbb83e20a96888e2b01df868eaae6  mesa-11.2.2.tar.gz

				40e148812388ec7c6d7b6657d5a16e2e8dabba8b97ddfceea5197947647bdfb4  mesa-11.2.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAddress always fails on mangled OSMesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>

				</ul>

				<h2>Changes</h2>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>radeon/uvd: alignment fix for decode message buffer</li>

				</ul>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>st/mesa: fix sampler view leak in st_DrawAtlasBitmaps()</li>

				  <li>gallium/util: initialize pipe_framebuffer_state to zeros</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>dri: Fix robust context creation via EGL attribute</li>

				</ul>

				<p>Egbert Eich (1):</p>

				<ul>

				  <li>dri2: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.2.1</li>

				  <li>docs: update the sha256 checksums for 11.2.1</li>

				  <li>cherry-ignore: remove duplicate commit</li>

				  <li>cherry-ignore: ignore the GetSamplerParameterIuiv{EXT,OES} fixups</li>

				  <li>Update version to 11.2.2</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Fix subimage accesses to LT textures.</li>

				  <li>vc4: Add support for rendering to cube map surfaces.</li>

				  <li>vc4: Fix tests for format supported with nr_samples == 1.</li>

				  <li>vc4: Make sure we recompile when sample_mask changes.</li>

				</ul>

				<p>Frederic Devernay (1):</p>

				<ul>

				  <li>glapi: fix _glapi_get_proc_address() for mangled function names</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nvc0: fix retrieving query results into buffer for timestamps</li>

				  <li>nouveau/video: properly detect the decoder class for availability checks</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/fs: Properly report regs_written from SAMPLEINFO</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>egl/x11: authenticate before doing chipset id ioctls</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.</li>

				  <li>glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor.</li>

				  <li>glsl: Lower vector_extracts to swizzles after lower_vector_derefs.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/uvd: fix tonga feedback buffer size</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>

				</ul>

				<p>Nicolai Hähnle (5):</p>

				<ul>

				  <li>gallium/radeon: handle failure when mapping staging buffer</li>

				  <li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>

				  <li>gallium/radeon: fix crash in r600_set_streamout_targets</li>

				  <li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>

				  <li>radeonsi: work around an MSAA fast stencil clear problem</li>

				</ul>

				<p>Oded Gabbay (4):</p>

				<ul>

				  <li>r600g/radeonsi: send endian info to format translation functions</li>

				  <li>r600g: set endianess of 16/32-bit buffers according to do_endian_swap</li>

				  <li>r600g: use do_endian_swap in color swapping functions</li>

				  <li>r600g: use do_endian_swap in texture swapping function</li>

				</ul>

				<p>Patrick Rudolph (1):</p>

				<ul>

				  <li>r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier</li>

				</ul>

				<p>Roland Scheidegger (3):</p>

				<ul>

				  <li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>

				  <li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>

				  <li>gallivm: make sampling more robust against bogus coordinates</li>

				</ul>

				<p>Samuel Pitoiset (6):</p>

				<ul>

				  <li>gk110/ir: do not overwrite def value with zero for EXCH ops</li>

				  <li>gk110/ir: make use of IMUL32I for all immediates</li>

				  <li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>

				  <li>gk110/ir: add emission for (a OP b) OP c</li>

				  <li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>

				  <li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>dri3: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Topi Pohjolainen (2):</p>

				<ul>

				  <li>i965/blorp/gen7: Prepare re-using for gen8</li>

				  <li>i965/blorp: Use 8k chunk size for urb allocation</li>

				</ul>

				<p>WuZhen (3):</p>

				<ul>

				  <li>tgsi: initialize stack allocated struct</li>

				  <li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>

				  <li>android: enable dlopen() on all architectures</li>

				</ul>

				</div>

				</body>

				</html>

									
										335

docs/relnotes/12.0.0.html
									
										Normal file
									
				@@ -0,0 +1,335 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.0 Release Notes / July 8, 2016</h1>

				<p>

				Mesa 12.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 12.0.1.

				</p>

				<p>

				Mesa 12.0.0 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				3b8fa4d86d78f8f6ec86055b92ad1afe869001483593b3dd4531184b8bc4fcfb  mesa-12.0.0.tar.gz

				0090c025219318935124292b482e3439bc43e8c074ad01086449fcad88547dc6  mesa-12.0.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 4.3 on nvc0, radeonsi, i965 (Gen8+)</li>

				<li>OpenGL ES 3.1 on nvc0, radeonsi</li>

				<li>GL_ARB_ES3_1_compatibility on nvc0, radeonsi</li>

				<li>GL_ARB_compute_shader on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_cull_distance on i965/gen6+, nv50, nvc0, llvmpipe, softpipe</li>

				<li>GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe</li>

				<li>GL_ARB_internalformat_query2 on all drivers</li>

				<li>GL_ARB_query_buffer_object on i965/hsw+</li>

				<li>GL_ARB_robust_buffer_access_behavior on i965, nvc0, radeonsi</li>

				<li>GL_ARB_shader_atomic_counters on radeonsi, softpipe</li>

				<li>GL_ARB_shader_atomic_counter_ops on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_image_load_store on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_image_size on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_storage_buffer_object on radeonsi, softpipe</li>

				<li>GL_ATI_fragment_shader on all Gallium drivers</li>

				<li>GL_EXT_base_instance on all drivers that support GL_ARB_base_instance</li>

				<li>GL_EXT_clip_cull_distance on all drivers that support GL_ARB_cull_distance</li>

				<li>GL_KHR_robustness on i965</li>

				<li>GL_OES_copy_image on i965 (Baytrail and Gen8+)</li>

				<li>GL_OES_draw_buffers_indexed and GL_EXT_draw_buffers_indexed on all drivers that support GL_ARB_draw_buffers_blend</li>

				<li>GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 on all drivers that support GL_ARB_gpu_shader5</li>

				<li>GL_OES_sample_shading on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_sample_variables on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_shader_image_atomic on all drivers that support GL_ARB_shader_image_load_store</li>

				<li>GL_OES_shader_io_blocks on i965, nvc0, radeonsi</li>

				<li>GL_OES_shader_multisample_interpolation on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_texture_border_clamp and GL_EXT_texture_border_clamp on all drivers that support GL_ARB_texture_border_clamp</li>

				<li>GL_OES_texture_buffer and GL_EXT_texture_buffer on i965, nvc0, radeonsi</li>

				<li>EGL_KHR_reusable_sync on all drivers</li>

				<li>GL_ARB_texture_stencil8 and GL_OES_texture_stencil8 on i965/gen8+</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=42187">Bug 42187</a> - ES 1.1 conformance pntszary.c fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81585">Bug 81585</a> - piglit spec_glsl-1.10_compiler_literals_invalid-float-suffix-capital-f.vert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89607">Bug 89607</a> - Assertion hit in opt_array_splitting with recursive array indexing</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92628">Bug 92628</a> - HTTP site for Mesa downloads</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93054">Bug 93054</a> - [BDW] DiRT Showdown and Bioshock Infinite only render half the screen (bottom left triangle)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93551">Bug 93551</a> - Divinity: Original Sin Enhanced Edition(Native) crash on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94086">Bug 94086</a> - Multiple conflicting libGL libraries installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94116">Bug 94116</a> - program interface queries not returning right data for UBO / GL_BLOCK_INDEX</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94129">Bug 94129</a> - Mesa's compiler should warn about undefined values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94181">Bug 94181</a> - [regression] piglit.spec.ext_framebuffer_object.getteximage-formats init-by-clear-and-render</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94198">Bug 94198</a> - [HSW] segfault in copy image when copying from cubemap to 2d</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94291">Bug 94291</a> - llvmpipe tests fail if built on skylake i7-6700k</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94348">Bug 94348</a> - vkBindImageMemory doesn't take into account the offset when the image is used as a depth buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94383">Bug 94383</a> - build error on i386 when enabling swr</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94447">Bug 94447</a> - glsl/glcpp/tests/glcpp-test-cr-lf regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94453">Bug 94453</a> - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_{center,corner} fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94454">Bug 94454</a> - dEQP-GLES3.functional.clipping.point.wide_point_clip* fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94456">Bug 94456</a> - dEQP-GLES3.functional.state_query.floats.{blend_color,color_clear_value,depth_clear_value}_getinteger64 fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94458">Bug 94458</a> - dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94468">Bug 94468</a> - [HSW, regression, bisected] numerous Sascha demos render incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94485">Bug 94485</a> - dEQP-GLES3.functional.negative_api.shader.compile_shader and delete_shader broken by Meta</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed texture's color encoding and render incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94657">Bug 94657</a> - [llvmpipe] [softpipe] piglit arb_texture_view-getteximage-srgb regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94661">Bug 94661</a> - [bdw, skl] vk-cts: new test failing</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94671">Bug 94671</a> - [radeonsi] Blue-ish textures in Shadow of Mordor</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94713">Bug 94713</a> - [Gen8+] ES 3.1 Stencil texturing broken for 2DArray/Cubes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94747">Bug 94747</a> - Convert phi nodes to logical operations</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94835">Bug 94835</a> - Increase fragment shader sample limits from 16 to 32 (AMD Linux - Mesa/RadeonSi)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94847">Bug 94847</a> - [ES3.1CTS] es31-cts.draw_buffers_indexed.color_masks fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94896">Bug 94896</a> - [vulkan] new CTS tests fail on i965</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94904">Bug 94904</a> - [vulkan, BSW] dEQP-VK.api.object_management.multithreaded_per_thread_device intermittent crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94907">Bug 94907</a> - codegen/nv50_ir_ra.cpp:1330:29: error: ‘isinf’ was not declared in this scope</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94909">Bug 94909</a> - [llvmpipe] piglit fs-roundEven-float regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94917">Bug 94917</a> - radeonsi supports GL_ARB_shader_storage_buffer_object with 0 GL_MAX_COMBINED_SHADER_STORAGE_BLOCKS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94924">Bug 94924</a> - [GEN8] Unigine Valley fails to run due to &quot;intel_do_flush_locked failed: Input/output error&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94925">Bug 94925</a> - Crash in egl_dri3_get_dri_context with Dolphin EGL/X11 in single-core mode</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94944">Bug 94944</a> - [regression, hswgt1] gpu hang on arb_shader_image_load_store</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94969">Bug 94969</a> - build fails because install-data-local doesn't follow $DESTDIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94972">Bug 94972</a> - blend failures on llvmpipe with llvm 3.7 due to vector selects</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94979">Bug 94979</a> - dolphin-emu rendering broken on gallium/SWR + crashing often</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94984">Bug 94984</a> - XCom2 crashes with SIGSEGV on radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAddress always fails on mangled OSMesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94997">Bug 94997</a> - [vulkan, SKL,BDW,HSW] deqp-vk.spirv_assembly.instruction.compute.opcopymemory.array regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94998">Bug 94998</a> - [vulkan] deqp-vk.pipeline.push_constant.graphics_pipeline.count_3shader_vgf regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95001">Bug 95001</a> - [vulkan] deqp-vk.binding_model.shader_access regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95005">Bug 95005</a> - Unreal engine demos segfault after shader compilation error with OpenGL 4.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95034">Bug 95034</a> - vkResetCommandPool should not destroy the command buffers.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95071">Bug 95071</a> - [bisected] Wrong colors in KDE/Qt applications</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95138">Bug 95138</a> - [deqp, 32bit, gen8+] deqp-gles31.functional.draw_indirect.negative</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95142">Bug 95142</a> - [ES3.1CTS,GEN8] ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95158">Bug 95158</a> - glx-test compilation fails in `make check`</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95180">Bug 95180</a> - rasterizer/memory/Convert.h:170:9: error: ‘__builtin_isnan’ is not a member of ‘std’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95198">Bug 95198</a> - Shadow of Mordor beta has missing geometry with gl 4.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95203">Bug 95203</a> - Tonga GST/OMX/VCE encode broken since mesa: st/omx: Fix resource leak on OMX_ErrorNone</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95211">Bug 95211</a> - scons TypeError: 'tuple' object is not callable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95252">Bug 95252</a> - [deqp] deqp-gles31.functional.debug.object_labels.query_length_only crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95292">Bug 95292</a> - [IVB,SKL] vulkan: stride/tiling issue with vkCmdCopyBufferToImage from larger source buffer into destination image</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95296">Bug 95296</a> - nir_lower_double_packing.c:79:4: error: void function 'lower_double_pack_impl' should not return a value [-Wreturn-type]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95324">Bug 95324</a> - GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo fails in one case on Haswell</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95370">Bug 95370</a> - [965GM] piglit fails many tests after a5d7e144</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95373">Bug 95373</a> - Suspicious warning in brw_blorp_clear.cpp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95403">Bug 95403</a> - [GK110] misaligned_gpr spamming dmesg when playing victor vran</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95456">Bug 95456</a> - glXGetFBConfigs has invalid screen bounds</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95537">Bug 95537</a> - Invalid argument  in anv_ioctl called from anv_physical_device_init</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96221">Bug 96221</a> - nir/nir_lower_tex.c:202: error: unknown field ‘f32’ specified in initializer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96228">Bug 96228</a> - SSBO test regressions from mesa 5b267509</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96236">Bug 96236</a> - dri_interface.h:404: error: redefinition of typedef ‘mesa_glinterop_device_info’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96238">Bug 96238</a> - swr fails to build outside of the main directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96239">Bug 96239</a> - [radeonsi tessellation] [R9 290/390] Random &quot;texture flickering&quot; (Shadow of Mordor, Tomb Raider, Unigine Heaven 4.0)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96258">Bug 96258</a> - [NVC0] Hang when running compute program</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regression due to latest gles 3.1)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>

				</ul>

				<h2>Changes</h2>

				Radeon drivers (r600 and radeonsi) now require LLVM 3.6 as a minimum.

				</div>

				</body>

				</html>

									
										67

docs/relnotes/12.0.1.html
									
										Normal file
									
				@@ -0,0 +1,67 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>

				<p>

				Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.0 release.

				</p>

				<p>

				Mesa 12.0.1 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				28dff9c045f4305c96a875a487b9f06c7e88d910511cd6016dbddcd1f53ade0d  mesa-12.0.1.tar.gz

				bab24fb79f78c876073527f515ed871fc9c81d816f66c8a0b051d8d653896389  mesa-12.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96864">Bug 96864</a> - Mesa 12.0 radeon build broken</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.0</li>

				  <li>radeon: reference the correct cdw/max_dw</li>

				  <li>Update version to 12.0.1</li>

				  <li>docs: add release notes for 12.0.1</li>

				</ul>

				</div>

				</body>

				</html>

									
										403

docs/relnotes/12.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,403 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.2 Release Notes / September 2, 2016</h1>

				<p>

				Mesa 12.0.2 is a bug fix release which fixes bugs found since the 12.0.1 release.

				</p>

				<p>

				Mesa 12.0.2 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a08565ab1273751ebe2ffa928cbf785056594c803077c9719d0763da780f2918  mesa-12.0.2.tar.gz

				d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb  mesa-12.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regression due to latest gles 3.1)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96381">Bug 96381</a> - Texture artifacts with immutable texture storage and mipmaps</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97567">Bug 97567</a> - [SNB, ILK] ctl, piglit regressions in mesa 12.0.2rc1</li>

				</ul>

				<h2>Changes</h2>

				<p>Andreas Boll (1):</p>

				<ul>

				  <li>configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too</li>

				</ul>

				<p>Bernard Kilarski (1):</p>

				<ul>

				  <li>glx: fix error code when there is no context bound</li>

				</ul>

				<p>Brian Paul (4):</p>

				<ul>

				  <li>svga: handle mismatched number of samplers, sampler views</li>

				  <li>mesa: use _mesa_clear_texture_image() in clear_texture_fields()</li>

				  <li>swrast: fix incorrectly positioned putImage() in swrast driver</li>

				  <li>mesa: fix format conversion bug in get_tex_rgba_uncompressed()</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965: Fix miptree layout for EGLImage-based renderbuffers</li>

				  <li>i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>st/mesa: fix reference counting bug in st_vdpau</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>swr: Refactor checks for compiler feature flags</li>

				</ul>

				<p>Daniel Scharrer (1):</p>

				<ul>

				  <li>mesa: Fix fixed function spot lighting on newer hardware (again)</li>

				</ul>

				<p>Dave Airlie (2):</p>

				<ul>

				  <li>anv: fix writemask on blit fragment shader.</li>

				  <li>st/glsl_to_tgsi: fix st_src_reg_for_double constant.</li>

				</ul>

				<p>Emil Velikov (15):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.1</li>

				  <li>mesa: automake: list builddir before srcdir</li>

				  <li>mesa: scons: list builddir before srcdir</li>

				  <li>i965: store reference to the context within struct brw_fence (v2)</li>

				  <li>anv: remove internal 'validate' layer</li>

				  <li>anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>

				  <li>anv: automake: build with -Bsymbolic</li>

				  <li>anv: do not export the Vulkan API</li>

				  <li>anv: remove dummy VK_DEBUG_MARKER_EXT entry points</li>

				  <li>isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>

				  <li>cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"</li>

				  <li>i915: Check return value of screen-&gt;image.loader-&gt;getBuffers</li>

				  <li>Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"</li>

				  <li>glx/glvnd: list the strcmp arguments in correct order</li>

				  <li>Update version to 12.0.2</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Close our screen's fd on screen close.</li>

				  <li>vc4: Disable early Z with computed depth.</li>

				  <li>vc4: Fix a leak of the src[] array of VPM reads in optimization.</li>

				  <li>vc4: Fix leak of the bo_handles table.</li>

				</ul>

				<p>Francisco Jerez (3):</p>

				<ul>

				  <li>i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.</li>

				  <li>i965: Make room in the batch epilogue for three more pipe controls.</li>

				  <li>i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.</li>

				</ul>

				<p>Haixia Shi (1):</p>

				<ul>

				  <li>platform_android: prevent deadlock in droid_swap_buffers</li>

				</ul>

				<p>Ian Romanick (5):</p>

				<ul>

				  <li>mesa: Strip arrayness from interface block names in some IO validation</li>

				  <li>glsl: Pack integer and double varyings as flat even if interpolation mode is none</li>

				  <li>glcpp: Track the actual version instead of just the version_resolved flag</li>

				  <li>glcpp: Only disallow #undef of pre-defined macros on GLSL ES &gt;= 3.00 shaders</li>

				  <li>glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10</li>

				</ul>

				<p>Ilia Mirkin (16):</p>

				<ul>

				  <li>mesa: etc2 online compression is unsupported, don't attempt it</li>

				  <li>st/mesa: return appropriate mesa format for ETC texture formats</li>

				  <li>mesa: set _NEW_BUFFERS when updating texture bound to current buffers</li>

				  <li>nv50,nvc0: srgb rendering is only available for rgba/bgra</li>

				  <li>vbo: allow DrawElementsBaseVertex in display lists</li>

				  <li>gallium/util: add helper to compute zmin/zmax for a viewport state</li>

				  <li>nv50,nvc0: fix depth range when halfz is enabled</li>

				  <li>nv50/ir: fix bb positions after exit instructions</li>

				  <li>vbo: add basevertex when looking up elements for vbo splitting</li>

				  <li>a4xx: only disable depth clipping, not all clipping, when requested</li>

				  <li>nv50/ir: make sure cfg iterator always hits all blocks</li>

				  <li>main: add missing EXTRA_END in OES_sample_variables get check</li>

				  <li>nouveau: always enable at least one RC</li>

				  <li>nv30: only bail on color/depth bpp mismatch when surfaces are swizzled</li>

				  <li>a4xx: make sure to actually clamp depth as requested</li>

				  <li>gk110/ir: fix quadop dall emission</li>

				</ul>

				<p>Jan Ziak (2):</p>

				<ul>

				  <li>egl/x11: avoid using freed memory if dri2 init fails</li>

				  <li>loader: fix memory leak in loader_dri3_open</li>

				</ul>

				<p>Jason Ekstrand (31):</p>

				<ul>

				  <li>nir/spirv: Don't multiply the push constant block size by 4</li>

				  <li>anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge</li>

				  <li>glsl/types: Fix function type comparison function</li>

				  <li>glsl/types: Use _mesa_hash_data for hashing function types</li>

				  <li>genxml: Make gen6-7 blending look more like gen8</li>

				  <li>anv/pipeline: Unify blend state setup between gen7 and gen8</li>

				  <li>anv: Enable independentBlend on gen7</li>

				  <li>anv: Add an align_down_npot_u32 helper</li>

				  <li>anv: Handle VK_WHOLE_SIZE properly for buffer views</li>

				  <li>i965/miptree: Enforce that height == 1 for 1-D array textures</li>

				  <li>i965/miptree: Set logical_depth0 == 6 for cube maps</li>

				  <li>nir: Add a nir_deref_foreach_leaf helper</li>

				  <li>nir/inline: Constant-initialize local variables in the callee if needed</li>

				  <li>anv/pipeline: Set up point coord enables</li>

				  <li>i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations</li>

				  <li>i965/vec4: Make opt_vector_float reset at the top of each block</li>

				  <li>anv/blit2d: Add a format parameter to bind_dst and create_iview</li>

				  <li>anv/blit2d: Add support for RGB destinations</li>

				  <li>anv/clear: Make cmd_clear_image take an actual VkClearValue</li>

				  <li>anv/clear: Clear E5B9G9R9 images as R32_UINT</li>

				  <li>anv: Include the pipeline layout in the shader hash</li>

				  <li>isl: Allow multisampled array textures</li>

				  <li>anv/descriptor_set: memset anv_descriptor_set_layout</li>

				  <li>anv/pipeline: Fix bind maps for fragment output arrays</li>

				  <li>anv/allocator: Correctly set the number of buckets</li>

				  <li>anv/pipeline: Properly handle OOM during shader compilation</li>

				  <li>anv: Remove unused fields from anv_pipeline_bind_map</li>

				  <li>anv: Add pipeline_has_stage guards a few places</li>

				  <li>anv: Add a struct for storing a compiled shader</li>

				  <li>anv/pipeline: Add support for caching the push constant map</li>

				  <li>anv: Rework pipeline caching</li>

				</ul>

				<p>José Fonseca (2):</p>

				<ul>

				  <li>appveyor: Install pywin32 extensions.</li>

				  <li>appveyor: Force Visual Studio 2013 image.</li>

				</ul>

				<p>Kenneth Graunke (21):</p>

				<ul>

				  <li>genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.</li>

				  <li>genxml: Add APIMODE_D3D missing enum values and improve consistency.</li>

				  <li>anv: Fix near plane clipping on Gen7/7.5.</li>

				  <li>anv: Enable early culling on Gen7.</li>

				  <li>anv: Unify 3DSTATE_CLIP code across generations.</li>

				  <li>genxml: Rename "API Rendering Disable" to "Rendering Disable".</li>

				  <li>anv: Properly call gen75_emit_state_base_address on Haswell.</li>

				  <li>i965: Include VUE handles for GS with invocations &gt; 1.</li>

				  <li>nir: Add a base const_index to shared atomic intrinsics.</li>

				  <li>i965: Fix shared atomic intrinsics to pay attention to base.</li>

				  <li>mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.</li>

				  <li>mesa: Don't call GenerateMipmap if Width or Height == 0.</li>

				  <li>glsl: Delete bogus ir_set_program_inouts assert.</li>

				  <li>glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].</li>

				  <li>glsl: Fix location bias for patch variables.</li>

				  <li>glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.</li>

				  <li>mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.</li>

				  <li>nir/builder: Add bany_inequal and bany helpers.</li>

				  <li>i965: Implement the WaPreventHSTessLevelsInterference workaround.</li>

				  <li>i965: Fix execution size of scalar TCS barrier setup code.</li>

				  <li>i965: Fix barrier count shift in scalar TCS backend.</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>st/omx/enc: check uninitialized list from task release</li>

				  <li>vl/dri3: fix a memory leak from front buffer</li>

				</ul>

				<p>Marek Olšák (7):</p>

				<ul>

				  <li>glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast</li>

				  <li>radeonsi: add a workaround for a compute VGPR-usage LLVM bug</li>

				  <li>winsys/amdgpu: disallow DCC with mipmaps</li>

				  <li>gallium/util: fix align64</li>

				  <li>radeonsi: only set dual source blending for MRT0</li>

				  <li>radeonsi: fix VM faults due NULL internal const buffers on CIK</li>

				  <li>radeonsi: disable SDMA texture copying on Carrizo</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>mapi: Massage code to allow clang to compile.</li>

				  <li>i965/vec4: Ignore swizzle of VGRF for use by var_range_end().</li>

				  <li>mesa: Use AC_HEADER_MAJOR to include correct header for major().</li>

				  <li>nir: Walk blocks in source code order in lower_vars_to_ssa.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>glx: Don't use current context in __glXSendError</li>

				</ul>

				<p>Miklós Máté (1):</p>

				<ul>

				  <li>vbo: set draw_id</li>

				</ul>

				<p>Nanley Chery (5):</p>

				<ul>

				  <li>anv/descriptor_set: Fix binding partly undefined descriptor sets</li>

				  <li>isl: Fix assert on raw buffer surface state size</li>

				  <li>anv/device: Fix max buffer range limits</li>

				  <li>isl: Fix isl_tiling_is_any_y()</li>

				  <li>anv/gen7_pipeline: Set PixelShaderKillPixel for discards</li>

				</ul>

				<p>Nicolai Hähnle (7):</p>

				<ul>

				  <li>radeonsi: explicitly choose center locations for 1xAA on Polaris</li>

				  <li>radeonsi: fix Polaris MSAA regression</li>

				  <li>radeonsi: ensure sample locations are set for line and polygon smoothing</li>

				  <li>st_glsl_to_tgsi: only skip over slots of an input array that are present</li>

				  <li>glsl: fix optimization of discard nested multiple levels</li>

				  <li>radeonsi: flush TC L2 cache for indirect draw data</li>

				  <li>radeonsi: add si_set_rw_buffer to be used for internal descriptors</li>

				</ul>

				<p>Nicolas Boichat (6):</p>

				<ul>

				  <li>egl/dri2: dri2_make_current: Set EGL error if bindContext fails</li>

				  <li>egl/wayland: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/surfaceless: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/drm: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/android: Set dpy-&gt;DriverData to NULL on error</li>

				  <li>egl/dri2: Add reference count for dri2_egl_display</li>

				</ul>

				<p>Rob Herring (3):</p>

				<ul>

				  <li>Android: add missing u_math.h include path for libmesa_isl</li>

				  <li>vc4: fix vc4_resource_from_handle() stride calculation</li>

				  <li>vc4: add hash table look-up for exported dmabufs</li>

				</ul>

				<p>Samuel Pitoiset (7):</p>

				<ul>

				  <li>nvc0/ir: fix images indirect access on Fermi</li>

				  <li>nvc0: fix the driver cb size when draw parameters are used</li>

				  <li>gm107/ir: add missing NEG modifier for IADD32I</li>

				  <li>gm107/ir: make use of ADD32I for all immediates</li>

				  <li>nvc0: upload sample locations on GM20x</li>

				  <li>nvc0: invalidate textures/samplers on GK104+</li>

				  <li>nv50/ir: always emit the NDV bit for OP_QUADOP</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>Avoid overflow in 'last' variable of FindGLXFunction(...)</li>

				</ul>

				<p>Stencel, Joanna (1):</p>

				<ul>

				  <li>egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.</li>

				</ul>

				<p>Tim Rowley (2):</p>

				<ul>

				  <li>Revert "gallium: Force blend color to 16-byte alignment"</li>

				  <li>swr: switch from overriding -march to selecting features</li>

				</ul>

				<p>Tomasz Figa (8):</p>

				<ul>

				  <li>gallium/dri: Add shared glapi to LIBADD on Android</li>

				  <li>egl/android: Remove unused variables</li>

				  <li>egl/android: Check return value of dri2_get_dri_config()</li>

				  <li>egl/android: Stop leaking DRI images</li>

				  <li>gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)</li>

				  <li>gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)</li>

				  <li>gallium/winsys/kms: Move display target handle lookup to separate function</li>

				  <li>gallium/winsys/kms: Look up the GEM handle after importing a prime FD</li>

				</ul>

				</div>

				</body>

				</html>

									
										71

docs/relnotes/12.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,71 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.3 Release Notes / September 15, 2016</h1>

				<p>

				Mesa 12.0.3 is a bug fix release which fixes bugs found since the 12.0.2 release.

				</p>

				<p>

				Mesa 12.0.3 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz

				1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97781">Bug 97781</a> - [HSW, BYT, IVB] es2-cts.gtf.gl2extensiontests.depth_texture_cube_map.depth_texture_cube_map</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.2</li>

				  <li>Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"</li>

				  <li>Update version to 12.0.3</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>appveyor: Update winflexbison download URL.</li>

				</ul>

				</div>

				</body>

				</html>

									
										321

docs/relnotes/12.0.4.html
									
										Normal file
									
												
				@@ -0,0 +1,321 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.4 Release Notes / November 10, 2016</h1>

				<p>

				Mesa 12.0.4 is a bug fix release which fixes bugs found since the 12.0.3 release.

				</p>

				<p>

				Mesa 12.0.4 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				22026ce4f1c6a7908b0d10ff057decec0a5633afe7f38a0cef5c08d0689f02a6 mesa-12.0.4.tar.gz

				5d6003da867d3f54e5000b4acdfc37e6cce5b6a4459274fdad73e24bd2f0065e mesa-12.0.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>

				</ul>

				<h2>Changes</h2>

				<p>Axel Davy (4):</p>

				<ul>

				  <li>gallium/util: Really allow aliasing of dst for u_box_union_*</li>

				  <li>st/nine: Fix the calculation of the number of vs inputs</li>

				  <li>st/nine: Fix mistake in Volume9 UnlockBox</li>

				  <li>st/nine: Fix locking CubeTexture surfaces.</li>

				</ul>

				<p>Brendan King (1):</p>

				<ul>

				  <li>configure.ac: fix the name of the Wayland Scanner pc file</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()</li>

				</ul>

				<p>Chad Versace (3):</p>

				<ul>

				  <li>egl: Fix truncation error in _eglParseSyncAttribList64</li>

				  <li>i965/sync: Fix uninitialized usage and leak of mutex</li>

				  <li>egl: Don't advertise unsupported platform extensions</li>

				</ul>

				<p>Chuanbo Weng (1):</p>

				<ul>

				  <li>gbm: fix potential NULL deref of mapImage/unmapImage.</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>autoconf: Make header install distinct for various APIs (v2)</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>anv: initialise and increment send_sbc</li>

				  <li>anv/wsi: fix apps that acquire multiple images up front</li>

				  <li>Revert "st/vdpau: use linear layout for output surfaces"</li>

				</ul>

				<p>Emil Velikov (12):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.3</li>

				  <li>cherry-ignore: add non-applicable i965 commit</li>

				  <li>cherry-ignore: add vaapi encode fix</li>

				  <li>cherry-ignore: add EGL_KHR_debug fix</li>

				  <li>cherry-ignore: add update_renderbuffer_read_surfaces()</li>

				  <li>isl/gen6: correctly check msaa layout samples count</li>

				  <li>egl/x11: don't crash if dri2_dpy-&gt;conn is NULL</li>

				  <li>get-pick-list.sh: Require explicit "12.0" for nominating stable patches</li>

				  <li>automake: don't forget to pick wglext.h in the tarball</li>

				  <li>cherry-ignore: add N/A EGL revert</li>

				  <li>cherry-ignore: add ClientWaitSync fixes</li>

				  <li>Update version to 12.0.4</li>

				</ul>

				<p>Eric Anholt (5):</p>

				<ul>

				  <li>travis: Parse configure.ac to pick an updated LIBDRM_VERSION.</li>

				  <li>travis: Update to the Ubuntu Trusty image.</li>

				  <li>travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.</li>

				  <li>travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.</li>

				  <li>gallium: Fix install-gallium-links.mk on non-bash /bin/sh</li>

				</ul>

				<p>Hans de Goede (1):</p>

				<ul>

				  <li>pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept</li>

				</ul>

				<p>Ilia Mirkin (16):</p>

				<ul>

				  <li>nv30: set usage to staging so that the buffer is allocated in GART</li>

				  <li>a3xx: make sure to actually clamp depth as requested</li>

				  <li>a3xx: make use of software clipping when hw can't handle it</li>

				  <li>a3xx: use window scissor to simulate viewport xy clip</li>

				  <li>main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer</li>

				  <li>mesa/formatquery: limit ES target support, fix core context support</li>

				  <li>nir: fix definition of pack_uvec2_to_uint</li>

				  <li>gm107/ir: AL2P writes to a predicate register</li>

				  <li>st/mesa: fix is_scissor_enabled when X/Y are negative</li>

				  <li>nvc0/ir: fix overwriting of value backing non-constant gather offset</li>

				  <li>nv50/ir: copy over value's register id when resolving merge of a phi</li>

				  <li>nvc0/ir: fix textureGather with a single offset</li>

				  <li>gm107/ir: fix texturing with indirect samplers</li>

				  <li>gm107/ir: fix bit offset of tex lod setting for indirect texturing</li>

				  <li>nv50,nvc0: avoid reading out of bounds when getting bogus so info</li>

				  <li>nv50/ir: process texture offset sources as regular sources</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>radeonsi: Fix primitive restart when index changes</li>

				</ul>

				<p>Jason Ekstrand (9):</p>

				<ul>

				  <li>nir/spirv: Swap the argument order for AtomicCompareExchange</li>

				  <li>nir/spirv: Use the correct sources for CompareExchange on images</li>

				  <li>nir/spirv: Break variable decoration handling into a helper</li>

				  <li>nir/spirv: Refactor variable decoration handling</li>

				  <li>nir/spirv/cfg: Handle switches whose break block is a loop continue</li>

				  <li>nir/spirv/cfg: Detect switch_break after loop_break/continue</li>

				  <li>nir: Add a nop intrinsic</li>

				  <li>nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks</li>

				  <li>intel/blorp: Rework our usage of ralloc when compiling shaders</li>

				</ul>

				<p>Jonathan Gray (3):</p>

				<ul>

				  <li>genxml: add generated headers to EXTRA_DIST</li>

				  <li>mapi: automake: set VISIBILITY_CFLAGS for shared glapi</li>

				  <li>mesa: automake: include mesa_glinterop.h in distfile</li>

				</ul>

				<p>Julien Isorce (1):</p>

				<ul>

				  <li>st/va: also honors interlaced preference when providing a video format</li>

				</ul>

				<p>Kenneth Graunke (8):</p>

				<ul>

				  <li>nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().</li>

				  <li>mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.</li>

				  <li>i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.</li>

				  <li>i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.</li>

				  <li>i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.</li>

				  <li>i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.</li>

				  <li>i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.</li>

				  <li>i965: Fix gl_InvocationID in dual object GS where invocations == 1.</li>

				</ul>

				<p>Marek Olšák (12):</p>

				<ul>

				  <li>radeonsi: fix cubemaps viewed as 2D</li>

				  <li>radeonsi: take compute shader and dispatch indirect memory usage into account</li>

				  <li>radeonsi: fix FP64 UBO loads with indirect uniform block indexing</li>

				  <li>mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc</li>

				  <li>radeonsi: fix interpolateAt opcodes for .zw components</li>

				  <li>radeonsi: fix texture border colors for compute shaders</li>

				  <li>radeonsi: disable ReZ</li>

				  <li>gallium/radeon: make sure the address of separate CMASK is aligned properly</li>

				  <li>winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures</li>

				  <li>egl: use util/macros.h</li>

				  <li>egl: make interop ABI visible again</li>

				  <li>glx: make interop ABI visible again</li>

				</ul>

				<p>Mario Kleiner (1):</p>

				<ul>

				  <li>glx: Perform check for valid fbconfig against proper X-Screen.</li>

				</ul>

				<p>Martin Peres (2):</p>

				<ul>

				  <li>loader/dri3: add get_dri_screen() to the vtable</li>

				  <li>loader/dri3: import prime buffers in the currently-bound screen</li>

				</ul>

				<p>Matt Whitlock (5):</p>

				<ul>

				  <li>egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				</ul>
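The entries above all make the same substitution, so a short sketch of it may help. Plain dup(2) returns a descriptor with the close-on-exec flag cleared, while fcntl() with F_DUPFD_CLOEXEC sets the flag in the same call that creates the duplicate. The helper names dup_cloexec and has_cloexec are hypothetical and chosen for illustration only; this is not the code from the Mesa patches.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes F_DUPFD_CLOEXEC on strict builds */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd with close-on-exec set atomically.
 * Returns the new descriptor, or -1 with errno set, as fcntl() does. */
static int dup_cloexec(int fd)
{
    /* The third argument is the lowest descriptor number that may be returned. */
    return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error. */
static int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Setting the flag at duplication time avoids the window in which another thread could fork and exec between a dup() call and a later fcntl(F_SETFD) call.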

				<p>Max Staudt (1):</p>

				<ul>

				  <li>r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>loader/dri3: Overhaul dri3_update_num_back</li>

				</ul>

				<p>Nicholas Bishop (2):</p>

				<ul>

				  <li>gbm: return appropriate error when queryImage() fails</li>

				  <li>st/dri: check pipe_screen-&gt;resource_get_handle() return value</li>

				</ul>

				<p>Nicolai Hähnle (10):</p>

				<ul>

				  <li>gallium/radeon: cleanup and fix branch emits</li>

				  <li>st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations</li>

				  <li>st/glsl_to_tgsi: simplify translate_tex_offset</li>

				  <li>st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets</li>

				  <li>st/mesa: fix vertex elements setup for doubles</li>

				  <li>radeonsi: fix indirect loads of 64 bit constants</li>

				  <li>st/glsl_to_tgsi: fix atomic counter addressing</li>

				  <li>st/glsl_to_tgsi: fix block copies of arrays of doubles</li>

				  <li>st/mesa: only set primitive_restart when the restart index is in range</li>

				  <li>radeonsi: fix 64-bit loads from LDS</li>

				</ul>

				<p>Samuel Pitoiset (4):</p>

				<ul>

				  <li>nvc0/ir: fix subops for IMAD</li>

				  <li>gk110/ir: fix wrong emission of OP_NOT</li>

				  <li>nvc0: use correct bufctx when invalidating CP textures</li>

				  <li>nvc0/ir: fix emission of IMAD with NEG modifiers</li>

				</ul>

				<p>Stencel, Joanna (1):</p>

				<ul>

				  <li>egl/wayland: add missing destroy_window callback</li>

				</ul>

				<p>Tapani Pälli (5):</p>

				<ul>

				  <li>egl: stop claiming support for pbuffer + msaa</li>

				  <li>egl/dri2: set max values for pbuffer width and height</li>

				  <li>egl: add check that eglCreateContext gets a valid config</li>

				  <li>mesa: fix error handling in DrawBuffers</li>

				  <li>egl: set preserved behavior for surface only if config supports it</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>configure.ac: add llvm inteljitevents component if enabled</li>

				</ul>

				<p>Vedran Miletić (1):</p>

				<ul>

				  <li>clover: Fix build against clang SVN &gt;= r273191</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>Revert "mesa_glinterop: remove inclusion of GLX header"</li>

				</ul>

				</div>

				</body>

				</html>

									
										138

docs/relnotes/12.0.5.html
									
										Normal file
									
												
				@@ -0,0 +1,138 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.5 Release Notes / December 5, 2016</h1>

				<p>

				Mesa 12.0.5 is a bug fix release which fixes bugs found since the 12.0.4 release.

				</p>

				<p>

				Mesa 12.0.5 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				44d08a27d98bfeacd864381189e434d98afbf451689d01f80380dc1d66450e5b  mesa-12.0.5.tar.gz

				2b0a972d8282860a11291c09c3ef01ac45171405951eb21a83c45ed2b4321924  mesa-12.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (2):</p>

				<ul>

				  <li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>

				  <li>glx/glvnd: Fix dispatch function names and indices</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add release notes for 12.0.4</li>

				  <li>docs: add sha256 checksums for 12.0.4</li>

				  <li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>

				  <li>Update version to 12.0.5</li>

				</ul>

				<p>Haixia Shi (1):</p>

				<ul>

				  <li>mesa: change state query return value for RGB565</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>i965/fs/generator: Don't use the address immediate for MOV_INDIRECT</li>

				  <li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>

				  <li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>intel: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>

				</ul>

				<p>Marek Olšák (13):</p>

				<ul>

				  <li>gallium/radeon: fix behavior of GLSL findLSB(0)</li>

				  <li>gallium/radeon: make sure HTILE address is aligned properly</li>

				  <li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>

				  <li>gallium/radeon: unify viewport emission code</li>

				  <li>gallium/radeon: set VPORT_ZMIN/MAX registers correctly</li>

				  <li>radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader</li>

				  <li>radeonsi: fix a crash in imageSize for cubemap arrays</li>

				  <li>radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it</li>

				  <li>gallium/radeon: add support for sharing textures with DCC between processes</li>

				  <li>radeonsi: always set all blend registers</li>

				  <li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>

				  <li>radeonsi: disable RB+ blend optimizations for dual source blending</li>

				  <li>radeonsi: silence runtime warnings with LLVM 3.9</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>anv: Replace "abi_versions" with correct "api_version".</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>mesa/fbobject: Update CubeMapFace when reusing textures</li>

				</ul>

				<p>Steinar H. Gunderson (1):</p>

				<ul>

				  <li>Fix races during _mesa_HashWalk().</li>

				</ul>

				<p>Tim Rowley (3):</p>

				<ul>

				  <li>swr: [rasterizer jitter] cleanup supporting different llvm versions</li>

				  <li>swr: [rasterizer jitter] fix llvm-3.7 compile</li>

				  <li>swr: [rasterizer] add support for llvm-3.9</li>

				</ul>

				</div>

				</body>

				</html>

									
										148

docs/relnotes/12.0.6.html
									
										Normal file
									
												
				@@ -0,0 +1,148 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.6 Release Notes / January 23, 2017</h1>

				<p>

				Mesa 12.0.6 is a bug fix release which fixes bugs found since the 12.0.5 release.

				</p>

				<p>

				Mesa 12.0.6 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
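Because the reported version depends on the driver, an application usually checks it at run time. The sketch below assumes the string was already obtained from glGetString(GL_VERSION) with a context current, and that it is a desktop OpenGL string beginning with "major.minor". The function name parse_gl_version is hypothetical, and the sample string in the usage note is illustrative only.

```c
#include <stdio.h>

/* Hypothetical helper: read the leading "major.minor" pair from a
 * desktop GL_VERSION-style string.
 * Returns 1 on success, 0 if the string is NULL or has no such prefix. */
static int parse_gl_version(const char *version, int *major, int *minor)
{
    if (version == NULL)
        return 0;
    /* sscanf returns the number of fields it converted; both are required. */
    return sscanf(version, "%d.%d", major, minor) == 2;
}
```

For a string such as "4.3 (Core Profile) Mesa 12.0.6" the helper yields major 4 and minor 3. Querying GL_MAJOR_VERSION and GL_MINOR_VERSION with glGetIntegerv avoids string parsing altogether on contexts that support those queries.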

				<h2>SHA256 checksums</h2>

				<pre>

				65339ba5d76a45225b8b56f9a1da9db15c569e1d163760faa2921da0a8461741  mesa-12.0.6.tar.gz

				7d6da9744c1022a4c2ab6ad01a206984d00443fb691568011d01b3dd97e36448  mesa-12.0.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>

				</ul>

				<h2>Changes</h2>

				<p>Chad Versace (3):</p>

				<ul>

				  <li>i965/mt: Disable aux surfaces after making miptree shareable</li>

				  <li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>

				  <li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>

				</ul>
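The last entry above concerns the Vulkan count-then-fill query convention. The sketch below is a simplified model of that convention, not the anv implementation and not the real Vulkan API: struct queue_family, AVAILABLE_FAMILIES and get_queue_families are hypothetical stand-ins. It shows why an incoming count of 0 with a non-NULL array must be a valid call that writes nothing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkQueueFamilyProperties. */
struct queue_family {
    uint32_t queue_count;
};

/* Assumed number of queue families this model device exposes. */
#define AVAILABLE_FAMILIES 1u

/* Model of the two-call idiom: with props == NULL, report how many
 * entries exist; otherwise fill at most *count entries and report how
 * many were actually written. */
static void get_queue_families(uint32_t *count, struct queue_family *props)
{
    if (props == NULL) {
        *count = AVAILABLE_FAMILIES;
        return;
    }

    uint32_t n = *count < AVAILABLE_FAMILIES ? *count : AVAILABLE_FAMILIES;
    for (uint32_t i = 0; i < n; i++)
        props[i].queue_count = 1;

    /* When the caller passed 0, the loop did not run and 0 is reported. */
    *count = n;
}
```

A caller normally invokes the function once with a NULL array to learn the count, allocates storage, and calls it again to fill the array.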

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.5</li>

				  <li>get-typod-pick-list.sh: add new script</li>

				  <li>automake: use shared llvm libs for make distcheck</li>

				  <li>egl/wayland: use the destroy_window_callback for swrast</li>

				  <li>Update version to 12.0.6</li>

				</ul>

				<p>Fredrik Höglund (1):</p>

				<ul>

				  <li>dri3: Fix MakeCurrent without a default framebuffer</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nouveau: take extra push space into account for pushbuf_space calls</li>

				</ul>

				<p>Jason Ekstrand (19):</p>

				<ul>

				  <li>spirv/nir: Fix some texture opcode asserts</li>

				  <li>spirv/nir: Add support for shadow samplers that return vec4</li>

				  <li>spirv/nir: Properly handle gather components</li>

				  <li>anv/pipeline: Set binding_table.gather_texture_start</li>

				  <li>nir: Add a helper for determining the type of a texture source</li>

				  <li>nir/lower_tex: Add some helpers for working with tex sources</li>

				  <li>nir/lower_tex: Add support for lowering coordinate offsets</li>

				  <li>i965/nir: Enable NIR lowering of txf and rect offsets</li>

				  <li>i965: Get rid of the do_lower_unnormalized_offsets pass</li>

				  <li>spirv/nir: Don't increment coord_components for array lod queries</li>

				  <li>anv/image: Assert that the image format is actually supported</li>

				  <li>spirv/nir: Move opcode selection higher up in handle_texture</li>

				  <li>spirv/nir: Refactor type handling in handle_texture</li>

				  <li>nir/spirv: Refactor coordinate handling in handle_texture</li>

				  <li>spirv/nir: Handle texture projectors</li>

				  <li>spirv/nir: Add support for ImageQuerySamples</li>

				  <li>anv/device: Return the right error for failed maps</li>

				  <li>anv/device: Implicitly unmap memory objects in FreeMemory</li>

				  <li>anv/descriptor_set: Write the state offset in the surface state free list.</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>

				  <li>i965: Properly flush in hsw_pause_transform_feedback().</li>

				</ul>

				<p>Marek Olšák (6):</p>

				<ul>

				  <li>cso: don't release sampler states that are bound</li>

				  <li>radeonsi: always restore sampler states when unbinding sampler views</li>

				  <li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>

				  <li>radeonsi: disable CE on SI + AMDGPU</li>

				  <li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>

				  <li>gallium/radeon: fix the draw-calls HUD query</li>

				</ul>

				<p>Matt Turner (3):</p>

				<ul>

				  <li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>

				  <li>i965/fs: Add unit tests for copy propagation pass.</li>

				  <li>i965/fs: Reject copy propagation into SEL if not min/max.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>radeonsi: enable WQM in PS prolog when needed</li>

				</ul>

				</div>

				</body>

				</html>

									
										34

docs/repository.html
									
												
				@@ -68,21 +68,39 @@ To get the Mesa sources anonymously (read-only):

				<h2 id="developer">Developer git Access</h2>

				<p>

				Mesa developers need to first have an account on

				<a href="http://www.freedesktop.org">freedesktop.org</a>.

				To get an account, please ask Brian or the other Mesa developers for

				permission.

				Then, if there are no objections, follow this

				<a href="http://www.freedesktop.org/wiki/AccountRequests">

				procedure</a>.

				If you wish to become a Mesa developer with git-write privilege, please

				follow this procedure:

				</p>

				<ol>

				<li>Subscribe to the

				<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				mailing list.

				<li>Start contributing to the project by posting patches / review requests to

				the mesa-dev list.  Specifically,

				<ul>

				<li>Use <code>git send-email</code> to post your patches to mesa-dev.

				<li>Wait for someone to review the code and give you a <code>Reviewed-by</code>

				statement.

				<li>You'll have to rely on another Mesa developer to push your initial patches

				after they've been reviewed.

				</ul>

				<li>After you've demonstrated the ability to write good code and have had

				a dozen or so patches accepted you can apply for an account.

				<li>Occasionally, but rarely, someone may be given a git account sooner, but

				only if they're being supervised by another Mesa developer at the same

				organization and planning to work in a limited area of the code or on a

				separate branch.

				<li>To apply for an account, follow

				<a href="http://www.freedesktop.org/wiki/AccountRequests">these directions</a>.

				It's also appreciated if you briefly describe what you intend to do (work

				on a particular driver, add a new extension, etc.) in the bugzilla record.

				</ol>

				<p>

				Once your account is established:

				</p>

				<ol>

				<li>Install the git software on your computer if needed.<br><br>

				<li>Get an initial, local copy of the repository with:

				    <pre>

				    git clone git+ssh://username@git.freedesktop.org/git/mesa/mesa

									
										45

docs/shading.html
									
												
				@@ -209,51 +209,6 @@ The final vertex and fragment programs may be interpreted in software

				(see drivers/dri/i915/i915_fragprog.c for example).

				</p>

				<h3>Code Generation Options</h3>

				<p>

				Internally, there are several options that control the compiler's code

				generation and instruction selection.

				These options are seen in the gl_shader_state struct and may be set

				by the device driver to indicate its preferences:

				<pre>

				struct gl_shader_state

				{

				   ...

				   /** Driver-selectable options: */

				   GLboolean EmitHighLevelInstructions;

				   GLboolean EmitCondCodes;

				   GLboolean EmitComments;

				};

				</pre>

				<dl>

				<dt>EmitHighLevelInstructions</dt>

				<dd>

				This option controls instruction selection for loops and conditionals.

				If the option is set high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK

				instructions will be emitted.

				Otherwise, those constructs will be implemented with BRA instructions.

				</dd>

				<dt>EmitCondCodes</dt>

				<dd>

				If set, condition codes (à la GL_NV_fragment_program) will be used for

				branching and looping.

				Otherwise, ordinary registers will be used (the IF instruction will

				examine the first operand's X component and do the if-part if non-zero).

				This option is only relevant if EmitHighLevelInstructions is set.

				</dd>

				<dt>EmitComments</dt>

				<dd>

				If set, instructions will be annotated with comments to help with debugging.

				Extra NOP instructions will also be inserted.

				</dd>

				</dl>

				<h2 id="validation">Compiler Validation</h2>

				<p>

									
										3

docs/systems.html
									
												
				@@ -34,8 +34,7 @@ Hardware drivers include:

				</p>

				<ul>

				  <li>Intel i965, i945, i915.

				    See <a href="http://intellinuxgraphics.org/index.html">

				      Intel's website</a></li>

				    See <a href="https://01.org/linuxgraphics">Intel's website</a></li>

				  <li>AMD Radeon series.

				  See <a href="http://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>

				  <li>NVIDIA GPUs.

									
										2

docs/utilities.html
									
												
				@@ -31,7 +31,7 @@

				  <dd>is a very useful tool for tracking down

				  memory-related problems in your code.</dd>

				  <dt><a href="http:scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dd>provides static code analysis of Mesa.  If you create an account

				  you can see the results and try to fix outstanding issues.</dd>

				</dl>

8

doxygen/.gitignore vendored


@@ -1,9 +1,6 @@
 *.db
 *.tag
 *.tmp
 agpgart
 array_cache
 core
 core_subset
 gallium
 gbm
@@ -13,11 +10,8 @@ i965
 main
 math
 math_subset
 miniglx
 radeondrm
 radeonfb
 nir
 radeon_subset
 shader
 swrast
 swrast_setup
 tnl

									
										7

doxygen/Makefile
									
												
				@@ -12,20 +12,21 @@ FULL = \

					vbo.doxy \

					glapi.doxy \

					glsl.doxy \

					shader.doxy \

					swrast.doxy \

					swrast_setup.doxy \

					tnl.doxy \

					tnl_dd.doxy \

					gbm.doxy \

					i965.doxy

					i965.doxy \

					nir.doxy

				full: $(FULL:.doxy=.tag)

					$(foreach FILE,$(FULL),doxygen $(FILE);)

				SUBSET = \

					main.doxy \

					math.doxy

					math.doxy \

					gallium.doxy

				subset: $(SUBSET:.doxy=.tag)

					$(foreach FILE,$(SUBSET),doxygen $(FILE);)

51

doxygen/common.doxy


@@ -53,16 +53,6 @@ CREATE_SUBDIRS         = NO
 OUTPUT_LANGUAGE        = English
 # This tag can be used to specify the encoding used in the generated output.
 # The encoding is not always determined by the language that is chosen,
 # but also whether or not the output is meant for Windows or non-Windows users.
 # In case there is a difference, setting the USE_WINDOWS_ENCODING tag to YES
 # forces the Windows encoding (this is the default for the Windows binary),
 # whereas setting the tag to NO uses a Unix-style encoding (the default for
 # all platforms other than Windows).
 USE_WINDOWS_ENCODING   = NO
 # If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will
 # include brief member descriptions after the members that are listed in
 # the file and class documentation (similar to JavaDoc).
@@ -147,13 +137,6 @@ JAVADOC_AUTOBRIEF      = YES
 MULTILINE_CPP_IS_BRIEF = NO
 # If the DETAILS_AT_TOP tag is set to YES then Doxygen
 # will output the detailed description near the top, like JavaDoc.
 # If set to NO, the detailed description appears after the member
 # documentation.
 DETAILS_AT_TOP         = YES
 # If the INHERIT_DOCS tag is set to YES (the default) then an undocumented
 # member inherits the documentation from any documented member that it
 # re-implements.
@@ -607,12 +590,6 @@ HTML_FOOTER            =
 HTML_STYLESHEET        =
 # If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes,
 # files or namespaces will be aligned in HTML using tables. If set to
 # NO a bullet list will be used.
 HTML_ALIGN_MEMBERS     = YES
 # If the GENERATE_HTMLHELP tag is set to YES, additional index files
 # will be generated that can be used as input for tools like the
 # Microsoft HTML help workshop to generate a compressed HTML help file (.chm)
@@ -839,18 +816,6 @@ GENERATE_XML           = NO
 XML_OUTPUT             = xml
 # The XML_SCHEMA tag can be used to specify an XML schema,
 # which can be used by a validating XML parser to check the
 # syntax of the XML files.
 XML_SCHEMA             =
 # The XML_DTD tag can be used to specify an XML DTD,
 # which can be used by a validating XML parser to check the
 # syntax of the XML files.
 XML_DTD                =
 # If the XML_PROGRAMLISTING tag is set to YES Doxygen will
 # dump the program listings (including syntax highlighting
 # and cross-referencing information) to the XML output. Note that
@@ -1104,22 +1069,6 @@ DOT_PATH               =
 DOTFILE_DIRS           =
 # The MAX_DOT_GRAPH_WIDTH tag can be used to set the maximum allowed width
 # (in pixels) of the graphs generated by dot. If a graph becomes larger than
 # this value, doxygen will try to truncate the graph, so that it fits within
 # the specified constraint. Beware that most browsers cannot cope with very
 # large images.
 MAX_DOT_GRAPH_WIDTH    = 1024
 # The MAX_DOT_GRAPH_HEIGHT tag can be used to set the maximum allows height
 # (in pixels) of the graphs generated by dot. If a graph becomes larger than
 # this value, doxygen will try to truncate the graph, so that it fits within
 # the specified constraint. Beware that most browsers cannot cope with very
 # large images.
 MAX_DOT_GRAPH_HEIGHT   = 1024
 # The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the
 # graphs generated by dot. A depth value of 3 means that only nodes reachable
 # from the root by following a path via at most 3 edges will be shown. Nodes that

3

doxygen/core_subset.doxy


@@ -190,8 +190,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES		= \
 			 math_subset.tag=../math_subset \
 			 miniglx.tag=../miniglx
 			 math_subset.tag=../math_subset
 GENERATE_TAGFILE       = core_subset.tag
 ALLEXTERNALS           = NO
 PERL_PATH              =

9

doxygen/doxy.bat


@@ -6,7 +6,9 @@ doxygen swrast_setup.doxy
 doxygen tnl.doxy
 doxygen core.doxy
 doxygen glapi.doxy
 doxygen shader.doxy
 doxygen glsl.doxy
 doxygen nir.doxy
 doxygen i965.doxy
 echo Building again, to resolve tags
 doxygen tnl_dd.doxy
@@ -15,5 +17,8 @@ doxygen math.doxy
 doxygen swrast.doxy
 doxygen swrast_setup.doxy
 doxygen tnl.doxy
 doxygen core.doxy
 doxygen glapi.doxy
 doxygen shader.doxy
 doxygen glsl.doxy
 doxygen nir.doxy
 doxygen i965.doxy

6

doxygen/gbm.doxy


@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast_setup.tag=../gbm_setup \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = gbm.tag

8

doxygen/glapi.doxy


@@ -9,7 +9,7 @@ PROJECT_NAME           = "Mesa GL API dispatcher"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/mesa/glapi/
 INPUT                  = ../src/mapi/glapi/
 FILE_PATTERNS          = *.c *.h
 RECURSIVE              = NO
 EXCLUDE                =
@@ -39,11 +39,11 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
 GENERATE_TAGFILE       = swrast.tag
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = glapi.tag

9

doxygen/glsl.doxy


@@ -9,11 +9,12 @@ PROJECT_NAME           = "Mesa GLSL module"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/glsl/
 INPUT                  = ../src/compiler/glsl/
 FILE_PATTERNS          = *.c *.cpp *.h
 RECURSIVE              = NO
 EXCLUDE                = ../src/glsl/glsl_lexer.cpp \
                          ../src/glsl/glsl_parser.cpp \
                          ../src/glsl/glsl_parser.h
 EXCLUDE                = ../src/compiler/glsl/glsl_lexer.cpp \
                          ../src/compiler/glsl/glsl_parser.cpp \
                          ../src/compiler/glsl/glsl_parser.h
 EXCLUDE_PATTERNS       =
 #---------------------------------------------------------------------------
 # configuration options related to the HTML output

									
										2

doxygen/header.html
									
												
				@@ -8,9 +8,9 @@

				<a class="qindex" href="../main/index.html">core</a> |

				<a class="qindex" href="../glapi/index.html">glapi</a> |

				<a class="qindex" href="../glsl/index.html">glsl</a> |

				<a class="qindex" href="../nir/index.html">nir</a> |

				<a class="qindex" href="../vbo/index.html">vbo</a> |

				<a class="qindex" href="../math/index.html">math</a> |

				<a class="qindex" href="../shader/index.html">shader</a> |

				<a class="qindex" href="../swrast/index.html">swrast</a> |

				<a class="qindex" href="../swrast_setup/index.html">swrast_setup</a> |

				<a class="qindex" href="../tnl/index.html">tnl</a> |

									
										1

doxygen/header_subset.html
									
												
				@@ -6,6 +6,5 @@

				<div class="qindex">

				<a class="qindex" href="../core_subset/index.html">Mesa Core</a> |

				<a class="qindex" href="../math_subset/index.html">math</a> |

				<a class="qindex" href="../miniglx/index.html">MiniGLX</a> |

				<a class="qindex" href="../radeon_subset/index.html">radeon_subset</a>

				</div>

2

doxygen/i965.doxy


@@ -46,5 +46,5 @@ TAGFILES               = glsl.tag=../glsl \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          tnl_dd.tag=../tnl_dd \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = i965.tag

1

doxygen/main.doxy


@@ -43,7 +43,6 @@ TAGFILES		= tnl_dd.tag=../tnl_dd \
 			 vbo.tag=../vbo \
                          glapi.tag=../glapi \
                          math.tag=../math \
                          shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl

2

doxygen/math.doxy


@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../core \
                          main.tag=../main \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \

43

doxygen/shader.doxy → doxygen/nir.doxy


@@ -5,45 +5,46 @@
 #---------------------------------------------------------------------------
 # General configuration options
 #---------------------------------------------------------------------------
 PROJECT_NAME           = "Mesa Vertex and Fragment Program code"
 PROJECT_NAME           = "Mesa NIR module"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 # Configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/mesa/shader/
 FILE_PATTERNS          = *.c *.h
 INPUT                  = ../src/compiler/nir
 FILE_PATTERNS          = *.c *.cpp *.h
 RECURSIVE              = NO
 EXCLUDE                =
 EXCLUDE_PATTERNS       =
 EXAMPLE_PATH           =
 EXAMPLE_PATTERNS       =
 EXCLUDE                =
 EXCLUDE_PATTERNS       =
 EXAMPLE_PATH           =
 EXAMPLE_PATTERNS       =
 EXAMPLE_RECURSIVE      = NO
 IMAGE_PATH             =
 INPUT_FILTER           =
 IMAGE_PATH             =
 INPUT_FILTER           =
 FILTER_SOURCE_FILES    = NO
 #---------------------------------------------------------------------------
 # configuration options related to the HTML output
 # Configuration options related to the HTML output
 #---------------------------------------------------------------------------
 HTML_OUTPUT            = shader
 HTML_OUTPUT            = nir
 #---------------------------------------------------------------------------
 # Configuration options related to the preprocessor
 # Configuration options related to the preprocessor
 #---------------------------------------------------------------------------
 ENABLE_PREPROCESSING   = YES
 MACRO_EXPANSION        = NO
 EXPAND_ONLY_PREDEF     = NO
 SEARCH_INCLUDES        = YES
 INCLUDE_PATH           = ../include/
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      =
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      =
 SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 # Configuration::additions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = glsl.tag=../glsl \
                          main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
 GENERATE_TAGFILE       = swrast.tag
                          tnl_dd.tag=../tnl_dd \
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = nir.tag

3

doxygen/radeon_subset.doxy


@@ -168,8 +168,7 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 TAGFILES		= \
 			 core_subset.tag=../core_subset \
                          math_subset.tag=../math_subset \
                          miniglx.tag=../miniglx
                          math_subset.tag=../math_subset
 GENERATE_TAGFILE       = radeon_subset.tag
 ALLEXTERNALS           = NO
 PERL_PATH              =

4

doxygen/swrast.doxy


@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = swrast.tag

2

doxygen/swrast_setup.doxy


@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../core \
                          main.tag=../main \
                          math.tag=../math \
                          swrast.tag=../swrast \
                          tnl.tag=../tnl \

9

doxygen/tnl.doxy


@@ -40,11 +40,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl \
                          main.tag=../core \
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../main \
                          math.tag=../math \
                          shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=swrast_setup \
                          vbo.tag=vbo
                          swrast_setup.tag=../swrast_setup \
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = tnl.tag

5

doxygen/tnl_dd.doxy


@@ -39,11 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
 			 shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = tnl_dd.tag

3

doxygen/vbo.doxy


@@ -40,9 +40,8 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
 			 math.tag=../math \
                          shader.tag=../shader \
 			 swrast.tag=../swrast \
 			 swrast_setup.tag=../swrast_setup \
 			 tnl.tag=../tnl \

									
										10

include/D3D9/d3d9.h
									
												
				@@ -260,7 +260,7 @@ struct IDirect3DDevice9 : public IUnknown

					virtual HRESULT WINAPI SetStreamSourceFreq(UINT StreamNumber, UINT Setting) = 0;

					virtual HRESULT WINAPI GetStreamSourceFreq(UINT StreamNumber, UINT *pSetting) = 0;

					virtual HRESULT WINAPI SetIndices(IDirect3DIndexBuffer9 *pIndexData) = 0;

					virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex) = 0;

					virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData) = 0;

					virtual HRESULT WINAPI CreatePixelShader(const DWORD *pFunction, IDirect3DPixelShader9 **ppShader) = 0;

					virtual HRESULT WINAPI SetPixelShader(IDirect3DPixelShader9 *pShader) = 0;

					virtual HRESULT WINAPI GetPixelShader(IDirect3DPixelShader9 **ppShader) = 0;

				@@ -848,7 +848,7 @@ typedef struct IDirect3DDevice9Vtbl

					HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT Setting);

					HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT *pSetting);

					HRESULT (WINAPI *SetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 *pIndexData);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData);

					HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9 *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);

					HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 *pShader);

					HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 **ppShader);

				@@ -975,7 +975,7 @@ struct IDirect3DDevice9

				#define IDirect3DDevice9_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)

				#define IDirect3DDevice9_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)

				#define IDirect3DDevice9_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)

				#define IDirect3DDevice9_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)

				#define IDirect3DDevice9_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)

				#define IDirect3DDevice9_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)

				@@ -1099,7 +1099,7 @@ typedef struct IDirect3DDevice9ExVtbl

					HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT Setting);

					HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT *pSetting);

					HRESULT (WINAPI *SetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 *pIndexData);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData);

					HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9Ex *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);

					HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 *pShader);

					HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 **ppShader);

				@@ -1242,7 +1242,7 @@ struct IDirect3DDevice9Ex

				#define IDirect3DDevice9Ex_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9Ex_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9Ex_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)

				#define IDirect3DDevice9Ex_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)

				#define IDirect3DDevice9Ex_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)

				#define IDirect3DDevice9Ex_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)

				#define IDirect3DDevice9Ex_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)

				#define IDirect3DDevice9Ex_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)

									
										20

include/D3D9/d3d9types.h
									
												
				@@ -173,16 +173,16 @@ typedef struct _RGNDATA {

				#define D3DPRESENTFLAG_RESTRICTED_CONTENT              0x00000400

				#define D3DPRESENTFLAG_RESTRICT_SHARED_RESOURCE_DRIVER 0x00000800

				#ifdef WINAPI

				#undef WINAPI

				#endif /* WINAPI*/

				#if defined(__x86_64__) || defined(_M_X64)

				#define WINAPI __attribute__((ms_abi))

				#else /* x86_64 */

				#define WINAPI __attribute__((__stdcall__))

				#endif /* x86_64 */

				/* Windows calling convention */

				#ifndef WINAPI

				  #if defined(__x86_64__) && !defined(__ILP32__)

				    #define WINAPI __attribute__((ms_abi))

				  #elif defined(__i386__)

				    #define WINAPI __attribute__((__stdcall__))

				  #else /* neither amd64 nor i386 */

				    #define WINAPI

				  #endif

				#endif /* WINAPI */

				/* Implementation caps */

				#define D3DPRESENT_BACK_BUFFERS_MAX    3
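
The replacement WINAPI block in the hunk above only defines the macro when nothing else has, and falls back to an empty definition on targets that are neither x86-64 nor i386. A minimal sketch of that guard in isolation, assuming a GCC- or Clang-compatible compiler. `WINAPI_KIND`, `add_one`, `call_through_pointer` and `winapi_kind` are hypothetical names introduced only for illustration and do not appear in the header:

```c
#include <string.h>

/* Same guard structure as the new header block: respect an existing
 * WINAPI definition, otherwise pick an attribute by target.
 * WINAPI_KIND is a hypothetical helper macro, not part of d3d9types.h. */
#ifndef WINAPI
  #if defined(__x86_64__) && !defined(__ILP32__)
    #define WINAPI __attribute__((ms_abi))
    #define WINAPI_KIND "ms_abi"
  #elif defined(__i386__)
    #define WINAPI __attribute__((__stdcall__))
    #define WINAPI_KIND "stdcall"
  #else /* neither amd64 nor i386 */
    #define WINAPI
    #define WINAPI_KIND "default"
  #endif
#else
  #define WINAPI_KIND "predefined"
#endif

/* Hypothetical function declared with the selected calling convention. */
static int WINAPI add_one(int x)
{
    return x + 1;
}

/* Function pointer type in the same style as the vtable entries in d3d9.h. */
typedef int (WINAPI *add_fn)(int);

/* Calls add_one through a pointer, so the convention on the pointer type
 * and on the function definition must agree for this to compile cleanly. */
static int call_through_pointer(int x)
{
    add_fn fn = add_one;
    return fn(x);
}

/* Reports which branch of the guard was taken at compile time. */
static const char *winapi_kind(void)
{
    return WINAPI_KIND;
}
```

Because the guard starts with `#ifndef WINAPI`, a build that already supplies its own `WINAPI` keeps it, whereas the removed block undefined and redefined the macro unconditionally.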

									
										11

include/EGL/eglmesaext.h
									
												
				@@ -34,17 +34,6 @@ extern "C" {

				#include <EGL/eglplatform.h>

				#ifndef EGL_MESA_drm_display

				#define EGL_MESA_drm_display 1

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLDisplay EGLAPIENTRY eglGetDRMDisplayMESA(int fd);

				#endif /* EGL_EGLEXT_PROTOTYPES */

				typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETDRMDISPLAYMESA) (int fd);

				#endif /* EGL_MESA_drm_display */

				#ifdef EGL_MESA_drm_image

				/* Mesa's extension to EGL_MESA_drm_image... */

				#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA

									
										108

include/GL/glcorearb.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,7 +33,7 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $

				** Khronos $Revision: 32433 $ on $Date: 2016-02-10 02:02:08 -0500 (Wed, 10 Feb 2016) $

				*/

				#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)

				@@ -1160,6 +1160,22 @@ typedef unsigned short GLhalf;

				#define GL_COLOR_ATTACHMENT13             0x8CED

				#define GL_COLOR_ATTACHMENT14             0x8CEE

				#define GL_COLOR_ATTACHMENT15             0x8CEF

				#define GL_COLOR_ATTACHMENT16             0x8CF0

				#define GL_COLOR_ATTACHMENT17             0x8CF1

				#define GL_COLOR_ATTACHMENT18             0x8CF2

				#define GL_COLOR_ATTACHMENT19             0x8CF3

				#define GL_COLOR_ATTACHMENT20             0x8CF4

				#define GL_COLOR_ATTACHMENT21             0x8CF5

				#define GL_COLOR_ATTACHMENT22             0x8CF6

				#define GL_COLOR_ATTACHMENT23             0x8CF7

				#define GL_COLOR_ATTACHMENT24             0x8CF8

				#define GL_COLOR_ATTACHMENT25             0x8CF9

				#define GL_COLOR_ATTACHMENT26             0x8CFA

				#define GL_COLOR_ATTACHMENT27             0x8CFB

				#define GL_COLOR_ATTACHMENT28             0x8CFC

				#define GL_COLOR_ATTACHMENT29             0x8CFD

				#define GL_COLOR_ATTACHMENT30             0x8CFE

				#define GL_COLOR_ATTACHMENT31             0x8CFF

				#define GL_DEPTH_ATTACHMENT               0x8D00

				#define GL_STENCIL_ATTACHMENT             0x8D20

				#define GL_FRAMEBUFFER                    0x8D40
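
The added GL_COLOR_ATTACHMENT16 through GL_COLOR_ATTACHMENT31 tokens continue the existing sequence without a gap, so an attachment token can be derived from its index. A small sketch of that arithmetic, assuming GL_COLOR_ATTACHMENT0 is 0x8CE0 (which follows from GL_COLOR_ATTACHMENT15 being 0x8CEF in the hunk above). `color_attachment` is a hypothetical helper, not a GL entry point:

```c
#include <assert.h>

/* Base token; consistent with GL_COLOR_ATTACHMENT15 == 0x8CEF above. */
#define GL_COLOR_ATTACHMENT0 0x8CE0

/* Hypothetical helper: maps an attachment index (0..31) to its enum value.
 * The header defines 32 tokens; how many a given implementation accepts is
 * a separate runtime limit that this sketch does not query. */
static unsigned int color_attachment(unsigned int index)
{
    assert(index < 32u);
    return GL_COLOR_ATTACHMENT0 + index;
}
```

With these definitions, index 16 maps to 0x8CF0 and index 31 to 0x8CFF, matching the first and last tokens added by the hunk.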

				@@ -2097,6 +2113,10 @@ GLAPI void APIENTRY glGetDoublei_v (GLenum target, GLuint index, GLdouble *data)

				#ifndef GL_VERSION_4_2

				#define GL_VERSION_4_2 1

				#define GL_COPY_READ_BUFFER_BINDING       0x8F36

				#define GL_COPY_WRITE_BUFFER_BINDING      0x8F37

				#define GL_TRANSFORM_FEEDBACK_ACTIVE      0x8E24

				#define GL_TRANSFORM_FEEDBACK_PAUSED      0x8E23

				#define GL_UNPACK_COMPRESSED_BLOCK_WIDTH  0x9127

				#define GL_UNPACK_COMPRESSED_BLOCK_HEIGHT 0x9128

				#define GL_UNPACK_COMPRESSED_BLOCK_DEPTH  0x9129

				@@ -2642,7 +2662,6 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui

				#define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES 0x82FA

				#define GL_TEXTURE_TARGET                 0x1006

				#define GL_QUERY_TARGET                   0x82EA

				#define GL_TEXTURE_BINDING                0x82EB

				#define GL_GUILTY_CONTEXT_RESET           0x8253

				#define GL_INNOCENT_CONTEXT_RESET         0x8254

				#define GL_UNKNOWN_CONTEXT_RESET          0x8255

				@@ -2655,25 +2674,25 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui

				typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum depth);

				typedef void (APIENTRYP PFNGLCREATETRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);

				typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERBASEPROC) (GLuint xfb, GLuint index, GLuint buffer);

				typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);

				typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);

				typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKIVPROC) (GLuint xfb, GLenum pname, GLint *param);

				typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint *param);

				typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI64_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);

				typedef void (APIENTRYP PFNGLCREATEBUFFERSPROC) (GLsizei n, GLuint *buffers);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizei size, const void *data, GLenum usage);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, const void *data);

				typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);

				typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);

				typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERDATAPROC) (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);

				typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);

				typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);

				typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERPROC) (GLuint buffer, GLenum access);

				typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);

				typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);

				typedef GLboolean (APIENTRYP PFNGLUNMAPNAMEDBUFFERPROC) (GLuint buffer);

				typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length);

				typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length);

				typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERIVPROC) (GLuint buffer, GLenum pname, GLint *params);

				typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERI64VPROC) (GLuint buffer, GLenum pname, GLint64 *params);

				typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPOINTERVPROC) (GLuint buffer, GLenum pname, void **params);

				typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, void *data);

				typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);

				typedef void (APIENTRYP PFNGLCREATEFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);

				typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERRENDERBUFFERPROC) (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);

				typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERPARAMETERIPROC) (GLuint framebuffer, GLenum pname, GLint param);
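
Most of the changes in the hunk above replace a `GLsizei` size or length parameter with `GLsizeiptr`. The sketch below illustrates why that matters for large buffers. It assumes the usual definitions, `int` for `GLsizei` and `ptrdiff_t` for `GLsizeiptr`, and declares them locally as stand-ins instead of including the real header. `fits_in_glsizei` and `fits_in_glsizeiptr` are hypothetical helpers:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs, assumed to match the usual GL header definitions. */
typedef int GLsizei;
typedef ptrdiff_t GLsizeiptr;

/* Hypothetical check: can a byte count be passed as a GLsizei (int)? */
static int fits_in_glsizei(long long bytes)
{
    return bytes >= 0 && bytes <= (long long)INT_MAX;
}

/* Hypothetical check: can a byte count be passed as a GLsizeiptr? */
static int fits_in_glsizeiptr(long long bytes)
{
    return bytes >= 0 && bytes <= (long long)PTRDIFF_MAX;
}
```

On a typical 64-bit target a 3 GiB request fails the first check but passes the second, which is the case the corrected prototypes allow callers to express.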

				@@ -2687,7 +2706,7 @@ typedef void (APIENTRYP PFNGLINVALIDATENAMEDFRAMEBUFFERSUBDATAPROC) (GLuint fram

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERUIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				typedef void (APIENTRYP PFNGLBLITNAMEDFRAMEBUFFERPROC) (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				typedef GLenum (APIENTRYP PFNGLCHECKNAMEDFRAMEBUFFERSTATUSPROC) (GLuint framebuffer, GLenum target);

				typedef void (APIENTRYP PFNGLGETNAMEDFRAMEBUFFERPARAMETERIVPROC) (GLuint framebuffer, GLenum pname, GLint *param);
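
The hunk above changes `PFNGLCLEARNAMEDFRAMEBUFFERFIPROC` to take a `GLint drawbuffer` argument and a plain `GLfloat depth`. A minimal sketch of a call through a pointer with the updated parameter list, using locally declared stand-in types and omitting `APIENTRYP` for brevity. `ClearNamedFramebufferfiFn`, `record_clear`, `last_clear` and `clear_depth_stencil` are hypothetical names, and `record_clear` is a stub that only records its arguments; it is not a GL implementation:

```c
/* Stand-in typedefs, assumed to match the usual GL header definitions. */
typedef unsigned int GLuint;
typedef unsigned int GLenum;
typedef int GLint;
typedef float GLfloat;

#define GL_DEPTH_STENCIL 0x84F9

/* Pointer type with the updated parameter list from the hunk above. */
typedef void (*ClearNamedFramebufferfiFn)(GLuint framebuffer, GLenum buffer,
                                          GLint drawbuffer, GLfloat depth,
                                          GLint stencil);

/* Records the most recent call so the arguments can be inspected. */
struct clear_call {
    GLuint framebuffer;
    GLenum buffer;
    GLint drawbuffer;
    GLfloat depth;
    GLint stencil;
};

static struct clear_call last_clear;

/* Hypothetical stub standing in for the driver entry point. */
static void record_clear(GLuint framebuffer, GLenum buffer, GLint drawbuffer,
                         GLfloat depth, GLint stencil)
{
    last_clear.framebuffer = framebuffer;
    last_clear.buffer = buffer;
    last_clear.drawbuffer = drawbuffer;
    last_clear.depth = depth;
    last_clear.stencil = stencil;
}

/* Clears depth to 1.0 and stencil to 0 on the given framebuffer object. */
static void clear_depth_stencil(ClearNamedFramebufferfiFn fn, GLuint fbo)
{
    fn(fbo, GL_DEPTH_STENCIL, 0, 1.0f, 0);
}
```

Code written against the old four-argument prototype would need the extra `drawbuffer` argument (0 in this sketch) to compile against the updated header.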

				@@ -2698,7 +2717,7 @@ typedef void (APIENTRYP PFNGLNAMEDRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLuint re

				typedef void (APIENTRYP PFNGLGETNAMEDRENDERBUFFERPARAMETERIVPROC) (GLuint renderbuffer, GLenum pname, GLint *params);

				typedef void (APIENTRYP PFNGLCREATETEXTURESPROC) (GLenum target, GLsizei n, GLuint *textures);

				typedef void (APIENTRYP PFNGLTEXTUREBUFFERPROC) (GLuint texture, GLenum internalformat, GLuint buffer);

				typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);

				typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);

				typedef void (APIENTRYP PFNGLTEXTURESTORAGE1DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);

				typedef void (APIENTRYP PFNGLTEXTURESTORAGE2DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (APIENTRYP PFNGLTEXTURESTORAGE3DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);

				@@ -2746,6 +2765,10 @@ typedef void (APIENTRYP PFNGLGETVERTEXARRAYINDEXED64IVPROC) (GLuint vaobj, GLuin

				typedef void (APIENTRYP PFNGLCREATESAMPLERSPROC) (GLsizei n, GLuint *samplers);

				typedef void (APIENTRYP PFNGLCREATEPROGRAMPIPELINESPROC) (GLsizei n, GLuint *pipelines);

				typedef void (APIENTRYP PFNGLCREATEQUERIESPROC) (GLenum target, GLsizei n, GLuint *ids);

				typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				typedef void (APIENTRYP PFNGLMEMORYBARRIERBYREGIONPROC) (GLbitfield barriers);

				typedef void (APIENTRYP PFNGLGETTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);

				typedef void (APIENTRYP PFNGLGETCOMPRESSEDTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);

				@@ -2762,25 +2785,25 @@ typedef void (APIENTRYP PFNGLTEXTUREBARRIERPROC) (void);

				GLAPI void APIENTRY glClipControl (GLenum origin, GLenum depth);

				GLAPI void APIENTRY glCreateTransformFeedbacks (GLsizei n, GLuint *ids);

				GLAPI void APIENTRY glTransformFeedbackBufferBase (GLuint xfb, GLuint index, GLuint buffer);

				GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);

				GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);

				GLAPI void APIENTRY glGetTransformFeedbackiv (GLuint xfb, GLenum pname, GLint *param);

				GLAPI void APIENTRY glGetTransformFeedbacki_v (GLuint xfb, GLenum pname, GLuint index, GLint *param);

				GLAPI void APIENTRY glGetTransformFeedbacki64_v (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);

				GLAPI void APIENTRY glCreateBuffers (GLsizei n, GLuint *buffers);

				GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);

				GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizei size, const void *data, GLenum usage);

				GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, const void *data);

				GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);

				GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);

				GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);

				GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);

				GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);

				GLAPI void APIENTRY glClearNamedBufferData (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);

				GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);

				GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);

				GLAPI void *APIENTRY glMapNamedBuffer (GLuint buffer, GLenum access);

				GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);

				GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);

				GLAPI GLboolean APIENTRY glUnmapNamedBuffer (GLuint buffer);

				GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length);

				GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length);

				GLAPI void APIENTRY glGetNamedBufferParameteriv (GLuint buffer, GLenum pname, GLint *params);

				GLAPI void APIENTRY glGetNamedBufferParameteri64v (GLuint buffer, GLenum pname, GLint64 *params);

				GLAPI void APIENTRY glGetNamedBufferPointerv (GLuint buffer, GLenum pname, void **params);

				GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, void *data);

				GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);

				GLAPI void APIENTRY glCreateFramebuffers (GLsizei n, GLuint *framebuffers);

				GLAPI void APIENTRY glNamedFramebufferRenderbuffer (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);

				GLAPI void APIENTRY glNamedFramebufferParameteri (GLuint framebuffer, GLenum pname, GLint param);

				@@ -2794,7 +2817,7 @@ GLAPI void APIENTRY glInvalidateNamedFramebufferSubData (GLuint framebuffer, GLs

				GLAPI void APIENTRY glClearNamedFramebufferiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);

				GLAPI void APIENTRY glClearNamedFramebufferuiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);

				GLAPI void APIENTRY glClearNamedFramebufferfv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);

				GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);

				GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				GLAPI void APIENTRY glBlitNamedFramebuffer (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				GLAPI GLenum APIENTRY glCheckNamedFramebufferStatus (GLuint framebuffer, GLenum target);

				GLAPI void APIENTRY glGetNamedFramebufferParameteriv (GLuint framebuffer, GLenum pname, GLint *param);

				@@ -2805,7 +2828,7 @@ GLAPI void APIENTRY glNamedRenderbufferStorageMultisample (GLuint renderbuffer,

				GLAPI void APIENTRY glGetNamedRenderbufferParameteriv (GLuint renderbuffer, GLenum pname, GLint *params);

				GLAPI void APIENTRY glCreateTextures (GLenum target, GLsizei n, GLuint *textures);

				GLAPI void APIENTRY glTextureBuffer (GLuint texture, GLenum internalformat, GLuint buffer);

				GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);

				GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);

				GLAPI void APIENTRY glTextureStorage1D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);

				GLAPI void APIENTRY glTextureStorage2D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);

				GLAPI void APIENTRY glTextureStorage3D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);

				@@ -2853,6 +2876,10 @@ GLAPI void APIENTRY glGetVertexArrayIndexed64iv (GLuint vaobj, GLuint index, GLe

				GLAPI void APIENTRY glCreateSamplers (GLsizei n, GLuint *samplers);

				GLAPI void APIENTRY glCreateProgramPipelines (GLsizei n, GLuint *pipelines);

				GLAPI void APIENTRY glCreateQueries (GLenum target, GLsizei n, GLuint *ids);

				GLAPI void APIENTRY glGetQueryBufferObjecti64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				GLAPI void APIENTRY glGetQueryBufferObjectiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				GLAPI void APIENTRY glGetQueryBufferObjectui64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				GLAPI void APIENTRY glGetQueryBufferObjectuiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);

				GLAPI void APIENTRY glMemoryBarrierByRegion (GLbitfield barriers);

				GLAPI void APIENTRY glGetTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);

				GLAPI void APIENTRY glGetCompressedTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);

				@@ -2990,8 +3017,6 @@ GLAPI void APIENTRY glDispatchComputeGroupSizeARB (GLuint num_groups_x, GLuint n

				#ifndef GL_ARB_copy_buffer

				#define GL_ARB_copy_buffer 1

				#define GL_COPY_READ_BUFFER_BINDING       0x8F36

				#define GL_COPY_WRITE_BUFFER_BINDING      0x8F37

				#endif /* GL_ARB_copy_buffer */

				#ifndef GL_ARB_copy_image

				@@ -3346,13 +3371,13 @@ GLAPI void APIENTRY glGetNamedStringivARB (GLint namelen, const GLchar *name, GL

				#define GL_ARB_sparse_buffer 1

				#define GL_SPARSE_STORAGE_BIT_ARB         0x0400

				#define GL_SPARSE_BUFFER_PAGE_SIZE_ARB    0x82F8

				typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);

				typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);

				typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);

				GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);

				GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);

				GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);

				GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);

				GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);

				#endif

				#endif /* GL_ARB_sparse_buffer */

				@@ -3360,7 +3385,7 @@ GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offs

				#define GL_ARB_sparse_texture 1

				#define GL_TEXTURE_SPARSE_ARB             0x91A6

				#define GL_VIRTUAL_PAGE_SIZE_INDEX_ARB    0x91A7

				#define GL_MIN_SPARSE_LEVEL_ARB           0x919B

				#define GL_NUM_SPARSE_LEVELS_ARB          0x91AA

				#define GL_NUM_VIRTUAL_PAGE_SIZES_ARB     0x91A8

				#define GL_VIRTUAL_PAGE_SIZE_X_ARB        0x9195

				#define GL_VIRTUAL_PAGE_SIZE_Y_ARB        0x9196

				@@ -3369,9 +3394,9 @@ GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offs

				#define GL_MAX_SPARSE_3D_TEXTURE_SIZE_ARB 0x9199

				#define GL_MAX_SPARSE_ARRAY_TEXTURE_LAYERS_ARB 0x919A

				#define GL_SPARSE_TEXTURE_FULL_ARRAY_CUBE_MIPMAPS_ARB 0x91A9

				typedef void (APIENTRYP PFNGLTEXPAGECOMMITMENTARBPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean resident);

				typedef void (APIENTRYP PFNGLTEXPAGECOMMITMENTARBPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean commit);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean resident);

				GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean commit);

				#endif

				#endif /* GL_ARB_sparse_texture */

				@@ -3479,8 +3504,6 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo

				#ifndef GL_ARB_transform_feedback2

				#define GL_ARB_transform_feedback2 1

				#define GL_TRANSFORM_FEEDBACK_PAUSED      0x8E23

				#define GL_TRANSFORM_FEEDBACK_ACTIVE      0x8E24

				#endif /* GL_ARB_transform_feedback2 */

				#ifndef GL_ARB_transform_feedback3

				@@ -3537,6 +3560,11 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo

				#define GL_KHR_debug 1

				#endif /* GL_KHR_debug */

				#ifndef GL_KHR_no_error

				#define GL_KHR_no_error 1

				#define GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR  0x00000008

				#endif /* GL_KHR_no_error */

				#ifndef GL_KHR_robust_buffer_access_behavior

				#define GL_KHR_robust_buffer_access_behavior 1

				#endif /* GL_KHR_robust_buffer_access_behavior */

				@@ -3582,6 +3610,10 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo

				#define GL_KHR_texture_compression_astc_ldr 1

				#endif /* GL_KHR_texture_compression_astc_ldr */

				#ifndef GL_KHR_texture_compression_astc_sliced_3d

				#define GL_KHR_texture_compression_astc_sliced_3d 1

				#endif /* GL_KHR_texture_compression_astc_sliced_3d */

				#ifdef __cplusplus

				}

				#endif

									
										87

include/GL/glext.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2015 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,7 +33,7 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 31811 $ on $Date: 2015-08-10 17:01:11 +1000 (Mon, 10 Aug 2015) $

				** Khronos $Revision: 32957 $ on $Date: 2016-06-09 17:03:08 -0400 (Thu, 09 Jun 2016) $

				*/

				#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)

				@@ -53,7 +53,7 @@ extern "C" {

				#define GLAPI extern

				#endif

				#define GL_GLEXT_VERSION 20150809

				#define GL_GLEXT_VERSION 20160609

				/* Generated C header for:

				 * API: gl

				@@ -2654,7 +2654,7 @@ typedef void (APIENTRYP PFNGLINVALIDATENAMEDFRAMEBUFFERSUBDATAPROC) (GLuint fram

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERUIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);

				typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				typedef void (APIENTRYP PFNGLBLITNAMEDFRAMEBUFFERPROC) (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				typedef GLenum (APIENTRYP PFNGLCHECKNAMEDFRAMEBUFFERSTATUSPROC) (GLuint framebuffer, GLenum target);

				typedef void (APIENTRYP PFNGLGETNAMEDFRAMEBUFFERPARAMETERIVPROC) (GLuint framebuffer, GLenum pname, GLint *param);

				@@ -2777,7 +2777,7 @@ GLAPI void APIENTRY glInvalidateNamedFramebufferSubData (GLuint framebuffer, GLs

				GLAPI void APIENTRY glClearNamedFramebufferiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);

				GLAPI void APIENTRY glClearNamedFramebufferuiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);

				GLAPI void APIENTRY glClearNamedFramebufferfv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);

				GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);

				GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				GLAPI void APIENTRY glBlitNamedFramebuffer (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				GLAPI GLenum APIENTRY glCheckNamedFramebufferStatus (GLuint framebuffer, GLenum target);

				GLAPI void APIENTRY glGetNamedFramebufferParameteriv (GLuint framebuffer, GLenum pname, GLint *param);

				@@ -4984,6 +4984,10 @@ GLAPI void APIENTRY glBlendBarrierKHR (void);

				#define GL_KHR_texture_compression_astc_ldr 1

				#endif /* GL_KHR_texture_compression_astc_ldr */

				#ifndef GL_KHR_texture_compression_astc_sliced_3d

				#define GL_KHR_texture_compression_astc_sliced_3d 1

				#endif /* GL_KHR_texture_compression_astc_sliced_3d */

				#ifndef GL_OES_byte_coordinates

				#define GL_OES_byte_coordinates 1

				typedef void (APIENTRYP PFNGLMULTITEXCOORD1BOESPROC) (GLenum texture, GLbyte s);

				@@ -5597,6 +5601,10 @@ GLAPI void APIENTRY glSetMultisamplefvAMD (GLenum pname, GLuint index, const GLf

				#define GL_AMD_shader_atomic_counter_ops 1

				#endif /* GL_AMD_shader_atomic_counter_ops */

				#ifndef GL_AMD_shader_explicit_vertex_parameter

				#define GL_AMD_shader_explicit_vertex_parameter 1

				#endif /* GL_AMD_shader_explicit_vertex_parameter */

				#ifndef GL_AMD_shader_stencil_export

				#define GL_AMD_shader_stencil_export 1

				#endif /* GL_AMD_shader_stencil_export */

				@@ -8637,6 +8645,20 @@ GLAPI void APIENTRY glVertexWeightPointerEXT (GLint size, GLenum type, GLsizei s

				#endif

				#endif /* GL_EXT_vertex_weighting */

				#ifndef GL_EXT_window_rectangles

				#define GL_EXT_window_rectangles 1

				#define GL_INCLUSIVE_EXT                  0x8F10

				#define GL_EXCLUSIVE_EXT                  0x8F11

				#define GL_WINDOW_RECTANGLE_EXT           0x8F12

				#define GL_WINDOW_RECTANGLE_MODE_EXT      0x8F13

				#define GL_MAX_WINDOW_RECTANGLES_EXT      0x8F14

				#define GL_NUM_WINDOW_RECTANGLES_EXT      0x8F15

				typedef void (APIENTRYP PFNGLWINDOWRECTANGLESEXTPROC) (GLenum mode, GLsizei count, const GLint *box);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glWindowRectanglesEXT (GLenum mode, GLsizei count, const GLint *box);

				#endif

				#endif /* GL_EXT_window_rectangles */

				#ifndef GL_EXT_x11_sync_object

				#define GL_EXT_x11_sync_object 1

				#define GL_SYNC_X11_FENCE_EXT             0x90E1

				@@ -9130,6 +9152,17 @@ GLAPI void APIENTRY glBlendBarrierNV (void);

				#define GL_NV_blend_square 1

				#endif /* GL_NV_blend_square */

				#ifndef GL_NV_clip_space_w_scaling

				#define GL_NV_clip_space_w_scaling 1

				#define GL_VIEWPORT_POSITION_W_SCALE_NV   0x937C

				#define GL_VIEWPORT_POSITION_W_SCALE_X_COEFF_NV 0x937D

				#define GL_VIEWPORT_POSITION_W_SCALE_Y_COEFF_NV 0x937E

				typedef void (APIENTRYP PFNGLVIEWPORTPOSITIONWSCALENVPROC) (GLuint index, GLfloat xcoeff, GLfloat ycoeff);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glViewportPositionWScaleNV (GLuint index, GLfloat xcoeff, GLfloat ycoeff);

				#endif

				#endif /* GL_NV_clip_space_w_scaling */

				#ifndef GL_NV_command_list

				#define GL_NV_command_list 1

				#define GL_TERMINATE_SEQUENCE_COMMAND_NV  0x0000

				@@ -9232,6 +9265,17 @@ GLAPI void APIENTRY glConservativeRasterParameterfNV (GLenum pname, GLfloat valu

				#endif

				#endif /* GL_NV_conservative_raster_dilate */

				#ifndef GL_NV_conservative_raster_pre_snap_triangles

				#define GL_NV_conservative_raster_pre_snap_triangles 1

				#define GL_CONSERVATIVE_RASTER_MODE_NV    0x954D

				#define GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV 0x954E

				#define GL_CONSERVATIVE_RASTER_MODE_PRE_SNAP_TRIANGLES_NV 0x954F

				typedef void (APIENTRYP PFNGLCONSERVATIVERASTERPARAMETERINVPROC) (GLenum pname, GLint param);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glConservativeRasterParameteriNV (GLenum pname, GLint param);

				#endif

				#endif /* GL_NV_conservative_raster_pre_snap_triangles */

				#ifndef GL_NV_copy_depth_to_color

				#define GL_NV_copy_depth_to_color 1

				#define GL_DEPTH_STENCIL_TO_RGBA_NV       0x886E

				@@ -10224,6 +10268,11 @@ GLAPI void APIENTRY glGetCombinerStageParameterfvNV (GLenum stage, GLenum pname,

				#endif

				#endif /* GL_NV_register_combiners2 */

				#ifndef GL_NV_robustness_video_memory_purge

				#define GL_NV_robustness_video_memory_purge 1

				#define GL_PURGED_CONTEXT_RESET_NV        0x92BB

				#endif /* GL_NV_robustness_video_memory_purge */

				#ifndef GL_NV_sample_locations

				#define GL_NV_sample_locations 1

				#define GL_SAMPLE_LOCATION_SUBPIXEL_BITS_NV 0x933D

				@@ -10256,6 +10305,10 @@ GLAPI void APIENTRY glResolveDepthValuesNV (void);

				#define GL_NV_shader_atomic_float 1

				#endif /* GL_NV_shader_atomic_float */

				#ifndef GL_NV_shader_atomic_float64

				#define GL_NV_shader_atomic_float64 1

				#endif /* GL_NV_shader_atomic_float64 */

				#ifndef GL_NV_shader_atomic_fp16_vector

				#define GL_NV_shader_atomic_fp16_vector 1

				#endif /* GL_NV_shader_atomic_fp16_vector */

				@@ -10319,6 +10372,10 @@ GLAPI void APIENTRY glProgramUniformui64vNV (GLuint program, GLint location, GLs

				#define GL_NV_shader_thread_shuffle 1

				#endif /* GL_NV_shader_thread_shuffle */

				#ifndef GL_NV_stereo_view_rendering

				#define GL_NV_stereo_view_rendering 1

				#endif /* GL_NV_stereo_view_rendering */

				#ifndef GL_NV_tessellation_program5

				#define GL_NV_tessellation_program5 1

				#define GL_MAX_PROGRAM_PATCH_ATTRIBS_NV   0x86D8

				@@ -11089,6 +11146,26 @@ GLAPI void APIENTRY glVideoCaptureStreamParameterdvNV (GLuint video_capture_slot

				#define GL_NV_viewport_array2 1

				#endif /* GL_NV_viewport_array2 */

				#ifndef GL_NV_viewport_swizzle

				#define GL_NV_viewport_swizzle 1

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_X_NV 0x9350

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_X_NV 0x9351

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_Y_NV 0x9352

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Y_NV 0x9353

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_Z_NV 0x9354

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Z_NV 0x9355

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_W_NV 0x9356

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_W_NV 0x9357

				#define GL_VIEWPORT_SWIZZLE_X_NV          0x9358

				#define GL_VIEWPORT_SWIZZLE_Y_NV          0x9359

				#define GL_VIEWPORT_SWIZZLE_Z_NV          0x935A

				#define GL_VIEWPORT_SWIZZLE_W_NV          0x935B

				typedef void (APIENTRYP PFNGLVIEWPORTSWIZZLENVPROC) (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);

				#ifdef GL_GLEXT_PROTOTYPES

				GLAPI void APIENTRY glViewportSwizzleNV (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);

				#endif

				#endif /* GL_NV_viewport_swizzle */

				#ifndef GL_OML_interlace

				#define GL_OML_interlace 1

				#define GL_INTERLACE_OML                  0x8980

									
										70

include/GL/internal/dri_interface.h
									
												
				@@ -79,6 +79,7 @@ typedef struct __DRIdri2LoaderExtensionRec	__DRIdri2LoaderExtension;

				typedef struct __DRI2flushExtensionRec	__DRI2flushExtension;

				typedef struct __DRI2throttleExtensionRec	__DRI2throttleExtension;

				typedef struct __DRI2fenceExtensionRec          __DRI2fenceExtension;

				typedef struct __DRI2interopExtensionRec	__DRI2interopExtension;

				typedef struct __DRIimageLoaderExtensionRec     __DRIimageLoaderExtension;

				@@ -392,6 +393,31 @@ struct __DRI2fenceExtensionRec {

				};

				/**

				 * Extension for API interop.

				 * See GL/mesa_glinterop.h.

				 */

				#define __DRI2_INTEROP "DRI2_Interop"

				#define __DRI2_INTEROP_VERSION 1

				struct mesa_glinterop_device_info;

				struct mesa_glinterop_export_in;

				struct mesa_glinterop_export_out;

				struct __DRI2interopExtensionRec {

				   __DRIextension base;

				   /** Same as MesaGLInterop*QueryDeviceInfo. */

				   int (*query_device_info)(__DRIcontext *ctx,

				                            struct mesa_glinterop_device_info *out);

				   /** Same as MesaGLInterop*ExportObject. */

				   int (*export_object)(__DRIcontext *ctx,

				                        struct mesa_glinterop_export_in *in,

				                        struct mesa_glinterop_export_out *out);

				};

				/*@}*/

				/**

				@@ -1068,7 +1094,7 @@ struct __DRIdri2ExtensionRec {

				 * extensions.

				 */

				#define __DRI_IMAGE "DRI_IMAGE"

				#define __DRI_IMAGE_VERSION 11

				#define __DRI_IMAGE_VERSION 12

				/**

				 * These formats correspond to the similarly named MESA_FORMAT_*

				@@ -1100,8 +1126,18 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_USE_SCANOUT		0x0002

				#define __DRI_IMAGE_USE_CURSOR		0x0004 /* Depricated */

				#define __DRI_IMAGE_USE_LINEAR		0x0008

				/* The buffer will only be read by an external process after SwapBuffers,

				 * in contrary to gbm buffers, front buffers and fake front buffers, which

				 * could be read after a flush."

				 */

				#define __DRI_IMAGE_USE_BACKBUFFER      0x0010

				#define __DRI_IMAGE_TRANSFER_READ            0x1

				#define __DRI_IMAGE_TRANSFER_WRITE           0x2

				#define __DRI_IMAGE_TRANSFER_READ_WRITE      \

				        (__DRI_IMAGE_TRANSFER_READ | __DRI_IMAGE_TRANSFER_WRITE)

				/**

				 * Four CC formats that matches with WL_DRM_FORMAT_* from wayland_drm.h,

				 * GBM_FORMAT_* from gbm.h, and DRM_FORMAT_* from drm_fourcc.h. Used with

				@@ -1127,6 +1163,11 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FOURCC_NV16		0x3631564e

				#define __DRI_IMAGE_FOURCC_YUYV		0x56595559

				#define __DRI_IMAGE_FOURCC_YVU410	0x39555659

				#define __DRI_IMAGE_FOURCC_YVU411	0x31315659

				#define __DRI_IMAGE_FOURCC_YVU420	0x32315659

				#define __DRI_IMAGE_FOURCC_YVU422	0x36315659

				#define __DRI_IMAGE_FOURCC_YVU444	0x34325659

				/**

				 * Queryable on images created by createImageFromNames.

				@@ -1350,6 +1391,33 @@ struct __DRIimageExtensionRec {

				    * \since 10

				    */

				   int (*getCapabilities)(__DRIscreen *screen);

				   /**

				    * Returns a map of the specified region of a __DRIimage for the specified usage.

				    *

				    * flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the

				    * mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ

				    * is not included in the flags, the buffer content at map time is

				    * undefined. Users wanting to modify the mapping must include

				    * __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not

				    * included, behaviour when writing the mapping is undefined.

				    *

				    * Returns the byte stride in *stride, and an opaque pointer to data

				    * tracking the mapping in **data, which must be passed to unmapImage().

				    *

				    * \since 12

				    */

				   void *(*mapImage)(__DRIcontext *context, __DRIimage *image,

				                     int x0, int y0, int width, int height,

				                     unsigned int flags, int *stride, void **data);

				   /**

				    * Unmap a previously mapped __DRIimage

				    *

				    * \since 12

				    */

				   void (*unmapImage)(__DRIcontext *context, __DRIimage *image, void *data);

				};
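The transfer flags documented for mapImage() combine as a bitmask. The following is a minimal sketch of how a caller might reason about them. It uses local DEMO_* copies of the flag values from this hunk, and the helper names are hypothetical, not part of dri_interface.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies of the __DRI_IMAGE_TRANSFER_* values from the hunk above. */
#define DEMO_TRANSFER_READ        0x1
#define DEMO_TRANSFER_WRITE       0x2
#define DEMO_TRANSFER_READ_WRITE  (DEMO_TRANSFER_READ | DEMO_TRANSFER_WRITE)

/* True if the mapping is populated with the current buffer content.
 * Without the READ flag the content at map time is undefined. */
static bool
demo_map_content_defined(unsigned int flags)
{
   return (flags & DEMO_TRANSFER_READ) != 0;
}

/* True if writing through the mapping has defined behaviour. */
static bool
demo_map_write_allowed(unsigned int flags)
{
   return (flags & DEMO_TRANSFER_WRITE) != 0;
}
```

A caller that wants to read pixels back, modify them, and write them out again would pass the combined READ_WRITE value, for which both helpers return true.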

									
										304

include/GL/mesa_glinterop.h
									
										Normal file
									
												
				@@ -0,0 +1,304 @@

				/*

				 * Mesa 3-D graphics library

				 *

				 * Copyright 2016 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS

				 * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR

				 * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,

				 * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR

				 * OTHER DEALINGS IN THE SOFTWARE.

				 */

				/* Mesa OpenGL inter-driver interoperability interface designed for but not

				 * limited to OpenCL.

				 *

				 * This is a driver-agnostic, backward-compatible interface. The structures

				 * are only allowed to grow. They can never shrink and their members can

				 * never be removed, renamed, or redefined.

				 *

				 * The interface doesn't return a lot of static texture parameters like

				 * width, height, etc. It mainly returns mutable buffer and texture view

				 * parameters that can't be part of the texture allocation (because they are

				 * mutable). If drivers want to return more data or want to return static

				 * allocation parameters, they can do it in one of these two ways:

				 * - attaching the data to the DMABUF handle in a driver-specific way

				 * - passing the data via "out_driver_data" in the "in" structure.

				 *

				 * Mesa is expected to do a lot of error checking on behalf of OpenCL, such

				 * as checking the target, miplevel, and texture completeness.

				 *

				 * OpenCL, on the other hand, needs to check if the display+context combo

				 * is compatible with the OpenCL driver by querying the device information.

				 * It also needs to check if the texture internal format and channel ordering

				 * (returned in a driver-specific way) is supported by OpenCL, among other

				 * things.

				 */

				#ifndef MESA_GLINTEROP_H

				#define MESA_GLINTEROP_H

				#include <stddef.h>

				#include <stdint.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/* Forward declarations to avoid inclusion of GL/glx.h */

				struct _XDisplay;

				struct __GLXcontextRec;

				/* Forward declarations to avoid inclusion of EGL/egl.h */

				typedef void *EGLDisplay;

				typedef void *EGLContext;

				/** Returned error codes. */

				enum {

				   MESA_GLINTEROP_SUCCESS = 0,

				   MESA_GLINTEROP_OUT_OF_RESOURCES,

				   MESA_GLINTEROP_OUT_OF_HOST_MEMORY,

				   MESA_GLINTEROP_INVALID_OPERATION,

				   MESA_GLINTEROP_INVALID_VERSION,

				   MESA_GLINTEROP_INVALID_DISPLAY,

				   MESA_GLINTEROP_INVALID_CONTEXT,

				   MESA_GLINTEROP_INVALID_TARGET,

				   MESA_GLINTEROP_INVALID_OBJECT,

				   MESA_GLINTEROP_INVALID_MIP_LEVEL,

				   MESA_GLINTEROP_UNSUPPORTED

				};

				/** Access flags. */

				enum {

				   MESA_GLINTEROP_ACCESS_READ_WRITE = 0,

				   MESA_GLINTEROP_ACCESS_READ_ONLY,

				   MESA_GLINTEROP_ACCESS_WRITE_ONLY

				};

				#define MESA_GLINTEROP_DEVICE_INFO_VERSION 1

				/**

				 * Device information returned by Mesa.

				 */

				struct mesa_glinterop_device_info {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_DEVICE_INFO_VERSION macro */

				   uint32_t version;

				   /* PCI location */

				   uint32_t pci_segment_group;

				   uint32_t pci_bus;

				   uint32_t pci_device;

				   uint32_t pci_function;

				   /* Device identification */

				   uint32_t vendor_id;

				   uint32_t device_id;

				   /* Structure version 1 ends here. */

				};

				#define MESA_GLINTEROP_EXPORT_IN_VERSION 1

				/**

				 * Input parameters to Mesa interop export functions.

				 */

				struct mesa_glinterop_export_in {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_IN_VERSION macro */

				   uint32_t version;

				   /* One of the following:

				    * - GL_TEXTURE_BUFFER

				    * - GL_TEXTURE_1D

				    * - GL_TEXTURE_2D

				    * - GL_TEXTURE_3D

				    * - GL_TEXTURE_RECTANGLE

				    * - GL_TEXTURE_1D_ARRAY

				    * - GL_TEXTURE_2D_ARRAY

				    * - GL_TEXTURE_CUBE_MAP_ARRAY

				    * - GL_TEXTURE_CUBE_MAP

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_X

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_X

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_Y

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_Y

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_Z

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_Z

				    * - GL_TEXTURE_2D_MULTISAMPLE

				    * - GL_TEXTURE_2D_MULTISAMPLE_ARRAY

				    * - GL_TEXTURE_EXTERNAL_OES

				    * - GL_RENDERBUFFER

				    * - GL_ARRAY_BUFFER

				    */

				   unsigned target;

				   /* If target is GL_ARRAY_BUFFER, it's a buffer object.

				    * If target is GL_RENDERBUFFER, it's a renderbuffer object.

				    * If target is GL_TEXTURE_*, it's a texture object.

				    */

				   unsigned obj;

				   /* Mipmap level. Ignored for non-texture objects. */

				   unsigned miplevel;

				   /* One of MESA_GLINTEROP_ACCESS_* flags. This describes how the exported

				    * object is going to be used.

				    */

				   uint32_t access;

				   /* Size of memory pointed to by out_driver_data. */

				   uint32_t out_driver_data_size;

				   /* If the caller wants to query driver-specific data about the OpenGL

				    * object, this should point to the memory where that data will be stored.

				    * This is expected to be a temporary staging memory. The pointer is not

				    * allowed to be saved for later use by Mesa.

				    */

				   void *out_driver_data;

				   /* Structure version 1 ends here. */

				};

				#define MESA_GLINTEROP_EXPORT_OUT_VERSION 1

				/**

				 * Outputs of Mesa interop export functions.

				 */

				struct mesa_glinterop_export_out {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_OUT_VERSION macro */

				   uint32_t version;

				   /* The DMABUF handle. It must be closed by the caller using the POSIX

				    * close() function when it's not needed anymore. Mesa is not responsible

				    * for closing the handle.

				    *

				    * Not closing the handle by the caller will lead to a resource leak,

				    * will prevent releasing the GPU buffer, and may prevent creating new

				    * DMABUF handles within the process.

				    */

				   int dmabuf_fd;

				   /* The mutable OpenGL internal format specified by glTextureView or

				    * glTexBuffer. If the object is not one of those, the original internal

				    * format specified by glTexStorage, glTexImage, or glRenderbufferStorage

				    * will be returned.

				    */

				   unsigned internal_format;

				   /* Buffer offset and size for GL_ARRAY_BUFFER and GL_TEXTURE_BUFFER.

				    * This allows interop with suballocations (a buffer allocated within

				    * a larger buffer).

				    *

				    * Parameters specified by glTexBufferRange for GL_TEXTURE_BUFFER are

				    * applied to these and can shrink the range further.

				    */

				   ptrdiff_t buf_offset;

				   ptrdiff_t buf_size;

				   /* Parameters specified by glTextureView. If the object is not a texture

				    * view, default parameters covering the whole texture will be returned.

				    */

				   unsigned view_minlevel;

				   unsigned view_numlevels;

				   unsigned view_minlayer;

				   unsigned view_numlayers;

				   /* The number of bytes written to out_driver_data. */

				   uint32_t out_driver_data_written;

				   /* Structure version 1 ends here. */

				};

				/**

				 * Query device information.

				 *

				 * \param dpy        GLX display

				 * \param context    GLX context

				 * \param out        where to return the information

				 *

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXQueryDeviceInfo(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                struct mesa_glinterop_device_info *out);

				/**

				 * Same as MesaGLInteropGLXQueryDeviceInfo except that it accepts EGLDisplay

				 * and EGLContext.

				 */

				int

				MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,

				                                struct mesa_glinterop_device_info *out);

				/**

				 * Create and return a DMABUF handle corresponding to the given OpenGL

				 * object, and return other parameters about the OpenGL object.

				 *

				 * \param dpy        GLX display

				 * \param context    GLX context

				 * \param in         input parameters

				 * \param out        return values

				 *

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXExportObject(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                             struct mesa_glinterop_export_in *in,

				                             struct mesa_glinterop_export_out *out);

				/**

				 * Same as MesaGLInteropGLXExportObject except that it accepts

				 * EGLDisplay and EGLContext.

				 */

				int

				MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,

				                             struct mesa_glinterop_export_in *in,

				                             struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPEGLQUERYDEVICEINFOPROC)(EGLDisplay dpy, EGLContext context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                                  struct mesa_glinterop_export_in *in,

				                                                  struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPEGLEXPORTOBJECTPROC)(EGLDisplay dpy, EGLContext context,

				                                                  struct mesa_glinterop_export_in *in,

				                                                  struct mesa_glinterop_export_out *out);

				#ifdef __cplusplus

				}

				#endif

				#endif /* MESA_GLINTEROP_H */
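The version handshake described in the struct comments above can be sketched as follows. This is a self-contained illustration, not Mesa code: `demo_device_info` is a local copy of the version-1 layout, `demo_query_device_info` is a hypothetical stand-in for a driver, and the device values it returns are made up. Rejecting a version of 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
   DEMO_SUCCESS = 0,
   DEMO_INVALID_VERSION = 4 /* same position as MESA_GLINTEROP_INVALID_VERSION */
};

/* Local copy of the version-1 layout of struct mesa_glinterop_device_info. */
struct demo_device_info {
   uint32_t version;
   uint32_t pci_segment_group;
   uint32_t pci_bus;
   uint32_t pci_device;
   uint32_t pci_function;
   uint32_t vendor_id;
   uint32_t device_id;
};

#define DEMO_CALLEE_VERSION 1u

/* Hypothetical callee that only implements version 1 of the struct. */
static int
demo_query_device_info(struct demo_device_info *out)
{
   if (out->version == 0)
      return DEMO_INVALID_VERSION;

   /* Lower the version if the caller asked for more than we implement. */
   if (out->version > DEMO_CALLEE_VERSION)
      out->version = DEMO_CALLEE_VERSION;

   /* Version 1 fields; the values are invented for the demo. */
   out->pci_segment_group = 0;
   out->pci_bus = 1;
   out->pci_device = 0;
   out->pci_function = 0;
   out->vendor_id = 0x1002;
   out->device_id = 0x67df;
   return DEMO_SUCCESS;
}

/* Caller side: write the version as a literal, then check what came back. */
static int
demo_read_vendor_id(uint32_t *vendor_id)
{
   struct demo_device_info info;

   assert(vendor_id != NULL);
   memset(&info, 0, sizeof(info));
   info.version = 1; /* the version this code was written against */

   int ret = demo_query_device_info(&info);
   if (ret != DEMO_SUCCESS)
      return ret;

   /* Only touch fields that belong to a version the callee confirmed. */
   if (info.version >= 1)
      *vendor_id = info.vendor_id;
   return DEMO_SUCCESS;
}
```

The caller writes the literal 1 rather than a version macro, presumably because a macro would change its value when the header is updated, while the calling code would still only understand the version-1 fields.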

									
										35

include/c11/threads_posix.h
									
												
				@@ -169,6 +169,32 @@ mtx_destroy(mtx_t *mtx)

				    pthread_mutex_destroy(mtx);

				}

				/*

				 * XXX: Workaround when building with -O0 and without pthreads link.

				 *

				 * In such cases constant folding and dead code elimination won't be

				 * available, thus the compiler will always add the pthread_mutexattr*

				 * functions into the binary. As we try to link, we'll fail as the

				 * symbols are unresolved.

				 *

				 * Ideally we'll enable the optimisations locally, yet that does not

				 * seem to work.

				 *

				 * So the alternative workaround is to annotate the symbols as weak.

				 * Thus the linker will be happy and things don't clash when building

				 * with -O1 or greater.

				 */

				#ifdef HAVE_FUNC_ATTRIBUTE_WEAK

				__attribute__((weak))

				int pthread_mutexattr_init(pthread_mutexattr_t *attr);

				__attribute__((weak))

				int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);

				__attribute__((weak))

				int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

				#endif

				// 7.25.4.2

				static inline int

				mtx_init(mtx_t *mtx, int type)

				@@ -180,9 +206,14 @@ mtx_init(mtx_t *mtx, int type)

				      && type != (mtx_timed|mtx_recursive)

				      && type != (mtx_try|mtx_recursive))

				        return thrd_error;

				    if ((type & mtx_recursive) == 0) {

				        pthread_mutex_init(mtx, NULL);

				        return thrd_success;

				    }

				    pthread_mutexattr_init(&attr);

				    if ((type & mtx_recursive) != 0)

				        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

				    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

				    pthread_mutex_init(mtx, &attr);

				    pthread_mutexattr_destroy(&attr);

				    return thrd_success;
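The weak-symbol workaround described in the comment above relies on a GCC/Clang feature: a function declared weak may stay unresolved at link time, in which case its address is null. A minimal sketch, assuming an ELF toolchain such as GCC or Clang on Linux; `demo_optional_hook` is a hypothetical function that nothing in the program defines.

```c
#include <assert.h>
#include <stddef.h>

/* Declared weak and never defined: the link still succeeds and the
 * address evaluates to NULL at run time. */
__attribute__((weak)) int demo_optional_hook(int value);

/* Call the hook if some library provided it, otherwise return -1. */
static int
demo_call_hook(int value)
{
   if (demo_optional_hook != NULL)
      return demo_optional_hook(value);
   return -1;
}
```

Without the weak attribute the same program would fail to link with an undefined reference, which is the failure mode the comment describes for the pthread_mutexattr_* functions when building with -O0 and without the pthreads library.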

									
										48

include/c99_compat.h
									
												
				@@ -36,8 +36,8 @@

				 */

				#if defined(_MSC_VER)

				#  if _MSC_VER < 1800

				#    error "Microsoft Visual Studio 2013 or higher required"

				#  if _MSC_VER < 1800 || (_MSC_FULL_VER < 180031101 && !defined(__clang__))

				#    error "Microsoft Visual Studio 2013 Update 4 or higher required"

				#  endif

				   /*

				@@ -135,4 +135,48 @@ test_c99_compat_h(const void * restrict a,

				#endif

				/* Fallback definitions, for build systems other than autoconfig which don't

				 * auto-detect these things. */

				#ifdef HAVE_NO_AUTOCONF

				#  ifndef _WIN32

				#    define HAVE_PTHREAD

				#    define HAVE_POSIX_MEMALIGN

				#  endif

				#  ifdef __GNUC__

				#    if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2)

				#      error "GCC version 4.2 or higher required"

				#    endif

				     /* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Other-Builtins.html */

				#    define HAVE___BUILTIN_CLZ 1

				#    define HAVE___BUILTIN_CLZLL 1

				#    define HAVE___BUILTIN_CTZ 1

				#    define HAVE___BUILTIN_EXPECT 1

				#    define HAVE___BUILTIN_FFS 1

				#    define HAVE___BUILTIN_FFSLL 1

				#    define HAVE___BUILTIN_POPCOUNT 1

				#    define HAVE___BUILTIN_POPCOUNTLL 1

				     /* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Function-Attributes.html */

				#    define HAVE_FUNC_ATTRIBUTE_FLATTEN 1

				#    define HAVE_FUNC_ATTRIBUTE_UNUSED 1

				#    define HAVE_FUNC_ATTRIBUTE_FORMAT 1

				#    define HAVE_FUNC_ATTRIBUTE_PACKED 1

				#    if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)

				       /* https://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/Other-Builtins.html */

				#      define HAVE___BUILTIN_BSWAP32 1

				#      define HAVE___BUILTIN_BSWAP64 1

				#    endif

				#    if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)

				#      define HAVE___BUILTIN_UNREACHABLE 1

				#    endif

				#  endif /* __GNUC__ */

				#endif /* !HAVE_AUTOCONF */

				#endif /* _C99_COMPAT_H_ */
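
Reviewer note on the fallback block above: it derives the HAVE___BUILTIN_* defines from the GCC version when no configure step has run. A rough sketch of how such a define is typically consumed follows. `SKETCH_HAVE_BUILTIN_CLZ` and `sketch_clz32` are hypothetical names chosen for this example and do not appear in Mesa.

```c
#include <stdint.h>

/* Mirror of the version test in the hunk: GCC 4.2 or newer (Clang also
 * reports itself as a compatible GCC version). */
#if defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 2))
#  define SKETCH_HAVE_BUILTIN_CLZ 1
#endif

/* Count leading zero bits of a 32-bit value; returns 32 for zero. */
unsigned sketch_clz32(uint32_t v)
{
    if (v == 0)
        return 32;              /* __builtin_clz(0) is undefined */
#ifdef SKETCH_HAVE_BUILTIN_CLZ
    return (unsigned)__builtin_clz(v);
#else
    /* Portable fallback: shift left until the top bit is set. */
    unsigned n = 0;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
#endif
}
```

Keeping the zero check outside the preprocessor branch gives both paths the same result for an input of 0.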

									
										23

include/c99_math.h
									
												
				@@ -185,4 +185,27 @@ fpclassify(double x)

				#endif

				/* Since C++11, the following functions are part of the std namespace. Their C

				 * counteparts should still exist in the global namespace, however cmath

				 * undefines those functions, which in glibc 2.23, are defined as macros rather

				 * than functions as in glibc 2.22.

				 */

				#if __cplusplus >= 201103L && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 23))

				#include <cmath>

				using std::fpclassify;

				using std::isfinite;

				using std::isinf;

				using std::isnan;

				using std::isnormal;

				using std::signbit;

				using std::isgreater;

				using std::isgreaterequal;

				using std::isless;

				using std::islessequal;

				using std::islessgreater;

				using std::isunordered;

				#endif

				#endif /* #define _C99_MATH_H_ */

									
										6

include/d3dadapter/drm.h
									
												
				@@ -29,7 +29,11 @@

				#define D3DADAPTER9DRM_NAME "drm"

				/* current version */

				#define D3DADAPTER9DRM_MAJOR 0

				#define D3DADAPTER9DRM_MINOR 0

				#define D3DADAPTER9DRM_MINOR 1

				/* version 0.0: Initial release

				 *         0.1: All IDirect3D objects can be assumed to have a pointer to the

				 *              internal vtable in second position of the structure */

				struct D3DAdapter9DRM

				{

									
										7

include/d3dadapter/present.h
									
												
				@@ -71,6 +71,10 @@ typedef struct ID3DPresentVtbl

				    HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This,  HWND hWnd, int *width, int *height, int *depth);

				    /* Available since version 1.1 */

				    BOOL (WINAPI *GetWindowOccluded)(ID3DPresent *This);

				    /* Available since version 1.2 */

				    BOOL (WINAPI *ResolutionMismatch)(ID3DPresent *This);

				    HANDLE (WINAPI *CreateThread)(ID3DPresent *This, void *pThreadfunc, void *pParam);

				    BOOL (WINAPI *WaitForThread)(ID3DPresent *This, HANDLE thread);

				} ID3DPresentVtbl;

				struct ID3DPresent

				@@ -99,6 +103,9 @@ struct ID3DPresent

				#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)

				#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowSize(p,a,b,c,d)

				#define ID3DPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)

				#define ID3DPresent_ResolutionMismatch(p) (p)->lpVtbl->ResolutionMismatch(p)

				#define ID3DPresent_CreateThread(p,a,b) (p)->lpVtbl->CreateThread(p,a,b)

				#define ID3DPresent_WaitForThread(p,a) (p)->lpVtbl->WaitForThread(p,a)

				typedef struct ID3DPresentGroupVtbl

				{

									
										6

include/pci_ids/i965_pci_ids.h
									
												
				@@ -156,10 +156,12 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherrytrail)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */

				CHIPSET(0x22B2, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B3, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x0A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x1A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

				CHIPSET(0x5A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

									
										22

include/pci_ids/radeonsi_pci_ids.h
									
												
				@@ -182,4 +182,26 @@ CHIPSET(0x9877, CARRIZO_, CARRIZO)

				CHIPSET(0x7300, FIJI_, FIJI)

				CHIPSET(0x67E0, POLARIS11_, POLARIS11)

				CHIPSET(0x67E1, POLARIS11_, POLARIS11)

				CHIPSET(0x67E3, POLARIS11_, POLARIS11)

				CHIPSET(0x67E7, POLARIS11_, POLARIS11)

				CHIPSET(0x67E8, POLARIS11_, POLARIS11)

				CHIPSET(0x67E9, POLARIS11_, POLARIS11)

				CHIPSET(0x67EB, POLARIS11_, POLARIS11)

				CHIPSET(0x67EF, POLARIS11_, POLARIS11)

				CHIPSET(0x67FF, POLARIS11_, POLARIS11)

				CHIPSET(0x67C0, POLARIS10_, POLARIS10)

				CHIPSET(0x67C1, POLARIS10_, POLARIS10)

				CHIPSET(0x67C2, POLARIS10_, POLARIS10)

				CHIPSET(0x67C4, POLARIS10_, POLARIS10)

				CHIPSET(0x67C7, POLARIS10_, POLARIS10)

				CHIPSET(0x67C8, POLARIS10_, POLARIS10)

				CHIPSET(0x67C9, POLARIS10_, POLARIS10)

				CHIPSET(0x67CA, POLARIS10_, POLARIS10)

				CHIPSET(0x67CC, POLARIS10_, POLARIS10)

				CHIPSET(0x67CF, POLARIS10_, POLARIS10)

				CHIPSET(0x67DF, POLARIS10_, POLARIS10)

				CHIPSET(0x98E4, STONEY_, STONEY)

									
										1

include/pci_ids/virtio_gpu_pci_ids.h
									
												
				@@ -1 +1,2 @@

				CHIPSET(0x0010, VIRTGL, VIRTGL)

				CHIPSET(0x1050, VIRTGL, VIRTGL)

									
										85

include/vulkan/vk_icd.h
									
										Normal file
									
												
				@@ -0,0 +1,85 @@

				#ifndef VKICD_H

				#define VKICD_H

				#include "vk_platform.h"

				/*

				 * The ICD must reserve space for a pointer for the loader's dispatch

				 * table, at the start of <each object>.

				 * The ICD must initialize this variable using the SET_LOADER_MAGIC_VALUE macro.

				 */

				#define ICD_LOADER_MAGIC   0x01CDC0DE

				typedef union _VK_LOADER_DATA {

				  uintptr_t loaderMagic;

				  void *loaderData;

				} VK_LOADER_DATA;

				static inline void set_loader_magic_value(void* pNewObject) {

				    VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;

				    loader_info->loaderMagic = ICD_LOADER_MAGIC;

				}

				static inline bool valid_loader_magic_value(void* pNewObject) {

				    const VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;

				    return (loader_info->loaderMagic & 0xffffffff) == ICD_LOADER_MAGIC;

				}

				/*

				 * Windows and Linux ICDs will treat VkSurfaceKHR as a pointer to a struct that

				 * contains the platform-specific connection and surface information.

				 */

				typedef enum _VkIcdWsiPlatform {

				    VK_ICD_WSI_PLATFORM_MIR,

				    VK_ICD_WSI_PLATFORM_WAYLAND,

				    VK_ICD_WSI_PLATFORM_WIN32,

				    VK_ICD_WSI_PLATFORM_XCB,

				    VK_ICD_WSI_PLATFORM_XLIB,

				} VkIcdWsiPlatform;

				typedef struct _VkIcdSurfaceBase {

				    VkIcdWsiPlatform   platform;

				} VkIcdSurfaceBase;

				#ifdef VK_USE_PLATFORM_MIR_KHR

				typedef struct _VkIcdSurfaceMir {

				    VkIcdSurfaceBase   base;

				    MirConnection*     connection;

				    MirSurface*        mirSurface;

				} VkIcdSurfaceMir;

				#endif // VK_USE_PLATFORM_MIR_KHR

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				typedef struct _VkIcdSurfaceWayland {

				    VkIcdSurfaceBase   base;

				    struct wl_display* display;

				    struct wl_surface* surface;

				} VkIcdSurfaceWayland;

				#endif // VK_USE_PLATFORM_WAYLAND_KHR

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				typedef struct _VkIcdSurfaceWin32 {

				    VkIcdSurfaceBase   base;

				    HINSTANCE          hinstance;

				    HWND               hwnd;

				} VkIcdSurfaceWin32;

				#endif // VK_USE_PLATFORM_WIN32_KHR

				#ifdef VK_USE_PLATFORM_XCB_KHR

				typedef struct _VkIcdSurfaceXcb {

				    VkIcdSurfaceBase   base;

				    xcb_connection_t*  connection;

				    xcb_window_t       window;

				} VkIcdSurfaceXcb;

				#endif // VK_USE_PLATFORM_XCB_KHR

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				typedef struct _VkIcdSurfaceXlib {

				    VkIcdSurfaceBase   base;

				    Display*           dpy;

				    Window             window;

				} VkIcdSurfaceXlib;

				#endif // VK_USE_PLATFORM_XLIB_KHR

				#endif // VKICD_H
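
Reviewer note on vk_icd.h above: the header reserves the first pointer-sized slot of each dispatchable object for the loader and stamps it with a magic value until the loader overwrites it. The sketch below illustrates that layout under stated assumptions: every `sketch_` / `SKETCH_` name is hypothetical, the union is redeclared locally so the example builds without the Vulkan headers, and `<stdbool.h>` is included explicitly for `bool`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOADER_MAGIC 0x01CDC0DE  /* same value as ICD_LOADER_MAGIC */

/* Local stand-in for VK_LOADER_DATA. */
typedef union {
    uintptr_t loaderMagic;
    void     *loaderData;
} sketch_loader_data;

/* Hypothetical driver object: the loader slot must be the first member. */
struct sketch_queue {
    sketch_loader_data loader;
    int                family_index;
};

/* Initialise a driver object and stamp the loader slot. */
void sketch_queue_init(struct sketch_queue *q, int family_index)
{
    q->loader.loaderMagic = SKETCH_LOADER_MAGIC;
    q->family_index = family_index;
}

/* True while the slot still holds the magic value, i.e. before the
 * loader has replaced it with its dispatch table pointer. */
bool sketch_queue_has_magic(const struct sketch_queue *q)
{
    return (q->loader.loaderMagic & 0xffffffff) == SKETCH_LOADER_MAGIC;
}
```

Because the slot sits at offset 0, a pointer to the object can be cast directly to a pointer to the loader data, which is what the header's helper functions rely on.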

									
										127

include/vulkan/vk_platform.h
									
										Normal file
									
												
				@@ -0,0 +1,127 @@

				//

				// File: vk_platform.h

				//

				/*

				** Copyright (c) 2014-2015 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				** "Materials"), to deal in the Materials without restriction, including

				** without limitation the rights to use, copy, modify, merge, publish,

				** distribute, sublicense, and/or sell copies of the Materials, and to

				** permit persons to whom the Materials are furnished to do so, subject to

				** the following conditions:

				**

				** The above copyright notice and this permission notice shall be included

				** in all copies or substantial portions of the Materials.

				**

				** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				*/

				#ifndef VK_PLATFORM_H_

				#define VK_PLATFORM_H_

				#ifdef __cplusplus

				extern "C"

				{

				#endif // __cplusplus

				/*

				***************************************************************************************************

				*   Platform-specific directives and type declarations

				***************************************************************************************************

				*/

				/* Platform-specific calling convention macros.

				 *

				 * Platforms should define these so that Vulkan clients call Vulkan commands

				 * with the same calling conventions that the Vulkan implementation expects.

				 *

				 * VKAPI_ATTR - Placed before the return type in function declarations.

				 *              Useful for C++11 and GCC/Clang-style function attribute syntax.

				 * VKAPI_CALL - Placed after the return type in function declarations.

				 *              Useful for MSVC-style calling convention syntax.

				 * VKAPI_PTR  - Placed between the '(' and '*' in function pointer types.

				 *

				 * Function declaration:  VKAPI_ATTR void VKAPI_CALL vkCommand(void);

				 * Function pointer type: typedef void (VKAPI_PTR *PFN_vkCommand)(void);

				 */

				#if defined(_WIN32)

				    // On Windows, Vulkan commands use the stdcall convention

				    #define VKAPI_ATTR

				    #define VKAPI_CALL __stdcall

				    #define VKAPI_PTR  VKAPI_CALL

				#elif defined(__ANDROID__) && defined(__ARM_EABI__) && !defined(__ARM_ARCH_7A__)

				    // Android does not support Vulkan in native code using the "armeabi" ABI.

				    #error "Vulkan requires the 'armeabi-v7a' or 'armeabi-v7a-hard' ABI on 32-bit ARM CPUs"

				#elif defined(__ANDROID__) && defined(__ARM_ARCH_7A__)

				    // On Android/ARMv7a, Vulkan functions use the armeabi-v7a-hard calling

				    // convention, even if the application's native code is compiled with the

				    // armeabi-v7a calling convention.

				    #define VKAPI_ATTR __attribute__((pcs("aapcs-vfp")))

				    #define VKAPI_CALL

				    #define VKAPI_PTR  VKAPI_ATTR

				#else

				    // On other platforms, use the default calling convention

				    #define VKAPI_ATTR

				    #define VKAPI_CALL

				    #define VKAPI_PTR

				#endif

				#include <stddef.h>

				#if !defined(VK_NO_STDINT_H)

				    #if defined(_MSC_VER) && (_MSC_VER < 1600)

				        typedef signed   __int8  int8_t;

				        typedef unsigned __int8  uint8_t;

				        typedef signed   __int16 int16_t;

				        typedef unsigned __int16 uint16_t;

				        typedef signed   __int32 int32_t;

				        typedef unsigned __int32 uint32_t;

				        typedef signed   __int64 int64_t;

				        typedef unsigned __int64 uint64_t;

				    #else

				        #include <stdint.h>

				    #endif

				#endif // !defined(VK_NO_STDINT_H)

				#ifdef __cplusplus

				} // extern "C"

				#endif // __cplusplus

				// Platform-specific headers required by platform window system extensions.

				// These are enabled prior to #including "vulkan.h". The same enable then

				// controls inclusion of the extension interfaces in vulkan.h.

				#ifdef VK_USE_PLATFORM_ANDROID_KHR

				#include <android/native_window.h>

				#endif

				#ifdef VK_USE_PLATFORM_MIR_KHR

				#include <mir_toolkit/client_types.h>

				#endif

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				#include <wayland-client.h>

				#endif

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				#include <windows.h>

				#endif

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				#include <X11/Xlib.h>

				#endif

				#ifdef VK_USE_PLATFORM_XCB_KHR

				#include <xcb/xcb.h>

				#endif

				#endif
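
Reviewer note on the calling-convention macros in vk_platform.h above: the comment describes where each macro goes in a declaration and in a function pointer type. A small sketch of that placement follows, using stand-in macros so it compiles without the Vulkan headers. All `SKETCH_` and `sketch_` names are assumptions made for illustration.

```c
/* Stand-ins for VKAPI_ATTR / VKAPI_CALL / VKAPI_PTR. */
#if defined(_WIN32)
#  define SKETCH_ATTR
#  define SKETCH_CALL __stdcall
#  define SKETCH_PTR  SKETCH_CALL
#else
#  define SKETCH_ATTR
#  define SKETCH_CALL
#  define SKETCH_PTR
#endif

/* Function declaration: ATTR before the return type, CALL after it. */
SKETCH_ATTR int SKETCH_CALL sketch_add(int a, int b);

/* Function pointer type: PTR between '(' and '*'. */
typedef int (SKETCH_PTR *PFN_sketch_add)(int a, int b);

SKETCH_ATTR int SKETCH_CALL sketch_add(int a, int b)
{
    return a + b;
}

/* Calls through the typedef so the declaration and the pointer type
 * are checked against each other by the compiler. */
int sketch_call_through_pointer(int a, int b)
{
    PFN_sketch_add fn = sketch_add;
    return fn(a, b);
}
```

If the convention in the pointer type did not match the declaration, the assignment to `fn` would be diagnosed on platforms where the macros expand to something non-empty.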

3800

include/vulkan/vulkan.h Normal file


File diff suppressed because it is too large

									
										62

include/vulkan/vulkan_intel.h
									
										Normal file
									
												
				@@ -0,0 +1,62 @@

				/*

				 * Copyright © 2015 Intel Corporation

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				 * IN THE SOFTWARE.

				 */

				#ifndef __VULKAN_INTEL_H__

				#define __VULKAN_INTEL_H__

				#include "vulkan.h"

				#ifdef __cplusplus

				extern "C"

				{

				#endif // __cplusplus

				#define VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL 1024

				typedef struct VkDmaBufImageCreateInfo_

				{

				    VkStructureType                             sType;                      // Must be VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL

				    const void*                                 pNext;                      // Pointer to next structure.

				    int                                         fd;

				    VkFormat                                    format;

				    VkExtent3D                                  extent;         // Depth must be 1

				    uint32_t                                    strideInBytes;

				} VkDmaBufImageCreateInfo;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateDmaBufImageINTEL)(VkDevice device, const VkDmaBufImageCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkDeviceMemory* pMem, VkImage* pImage);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateDmaBufImageINTEL(

				    VkDevice                                    _device,

				    const VkDmaBufImageCreateInfo*              pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkDeviceMemory*                             pMem,

				    VkImage*                                    pImage);

				#endif

				#ifdef __cplusplus

				} // extern "C"

				#endif // __cplusplus

				#endif // __VULKAN_INTEL_H__

									
										21

install-gallium-links.mk
									
												
				@@ -3,18 +3,18 @@

				if BUILD_SHARED

				if HAVE_COMPAT_SYMLINKS

				all-local : .libs/install-gallium-links

				all-local : .install-gallium-links

				.libs/install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)

				.install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)

					$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR);	\

					link_dir=$(top_builddir)/$(LIB_DIR)/gallium;		\

					if test x$(egl_LTLIBRARIES) != x; then			\

						link_dir=$(top_builddir)/$(LIB_DIR)/egl;	\

					fi;							\

					$(MKDIR_P) $$link_dir;					\

					file_list=$(dri_LTLIBRARIES:%.la=.libs/%.so);		\

					file_list+=$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*);	\

					file_list+=$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*);	\

					file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)";		\

					file_list="$$file_list$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

					file_list="$$file_list$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

					for f in $$file_list; do 				\

						if test -h .libs/$$f; then			\

							cp -d $$f $$link_dir;			\

				@@ -23,4 +23,15 @@ all-local : .libs/install-gallium-links

						fi;						\

					done && touch $@

				endif

				clean-local:

					for f in $(notdir $(dri_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \

						 $(notdir $(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \

						 $(notdir $(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)); do \

						echo $$f; \

						$(RM) $(top_builddir)/$(LIB_DIR)/gallium/$$f;   \

					done;

					rmdir $(top_builddir)/$(LIB_DIR)/gallium || true

					$(RM) .install-gallium-links

				endif

7

m4/ax_gcc_func_attribute.m4


@@ -53,6 +53,7 @@
 #    optimize
 #    packed
 #    pure
 #    returns_nonnull
 #    unused
 #    used
 #    visibility
@@ -76,6 +77,9 @@
 #serial 2
 # mattst88:
 #     Added support for returns_nonnull attribute
 AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
     AS_VAR_PUSHDEF([ac_var], [ax_cv_have_func_attribute_$1])
@@ -175,6 +179,9 @@ AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
                 [pure], [
                     int foo( void ) __attribute__(($1));
                 ],
                 [returns_nonnull], [
                     int *foo( void ) __attribute__(($1));
                 ],
                 [unused], [
                     int foo( void ) __attribute__(($1));
                 ],

									
										36

scons/custom.py
									
												
				@@ -30,11 +30,10 @@ Custom builders and methods.

				#

				import os

				import os.path

				import re

				import sys

				import subprocess

				import modulefinder

				import SCons.Action

				import SCons.Builder

				@@ -44,6 +43,13 @@ import fixes

				import source_list

				# the get_implicit_deps() method changed between 2.4 and 2.5: now it expects

				# a callable that takes a scanner as argument and returns a path, rather than

				# a path directly. We want to support both, so we need to detect the SCons version,

				# for which no API is provided by SCons 8-P

				scons_version = tuple(map(int, SCons.__version__.split('.')))

				def quietCommandLines(env):

				    # Quiet command lines

				    # See also http://www.scons.org/wiki/HidingCommandLinesInOutput

				@@ -93,27 +99,19 @@ def createConvenienceLibBuilder(env):

				    return convenience_lib

				# TODO: handle import statements with multiple modules

				# TODO: handle from import statements

				import_re = re.compile(r'^\s*import\s+(\S+)\s*$', re.M)

				def python_scan(node, env, path):

				    # http://www.scons.org/doc/0.98.5/HTML/scons-user/c2781.html#AEN2789

				    # https://docs.python.org/2/library/modulefinder.html

				    contents = node.get_contents()

				    source_dir = node.get_dir()

				    imports = import_re.findall(contents)

				    finder = modulefinder.ModuleFinder()

				    finder.run_script(node.abspath)

				    results = []

				    for imp in imports:

				        for dir in path:

				            file = os.path.join(str(dir), imp.replace('.', os.sep) + '.py')

				            if os.path.exists(file):

				                results.append(env.File(file))

				                break

				            file = os.path.join(str(dir), imp.replace('.', os.sep), '__init__.py')

				            if os.path.exists(file):

				                results.append(env.File(file))

				                break

				    #print node, map(str, results)

				    for name, mod in finder.modules.iteritems():

				        if mod.__file__ is None:

				            continue

				        assert os.path.exists(mod.__file__)

				        results.append(env.File(mod.__file__))

				    return results

				python_scanner = SCons.Scanner.Scanner(function = python_scan, skeys = ['.py'])

				@@ -138,7 +136,7 @@ def code_generate(env, script, target, source, command):

				    # Explicitly mark that the generated code depends on the generator,

				    # and on implicitly imported python modules

				    path = (script_src.get_dir(),)

				    path = (script_src.get_dir(),) if scons_version < (2, 5, 0) else lambda x: script_src

				    deps = [script_src]

				    deps += script_src.get_implicit_deps(env, python_scanner, path)

				    env.Depends(code, deps)

									
										110

scons/gallium.py
									
												
				@@ -82,11 +82,6 @@ def install_shared_library(env, sources, version = ()):

				    return targets

				def createInstallMethods(env):

				    env.AddMethod(install_program, 'InstallProgram')

				    env.AddMethod(install_shared_library, 'InstallSharedLibrary')

				def msvc2013_compat(env):

				    if env['gcc']:

				        env.Append(CCFLAGS = [

				@@ -94,8 +89,20 @@ def msvc2013_compat(env):

				            '-Werror=pointer-arith',

				        ])

				def createMSVCCompatMethods(env):

				    env.AddMethod(msvc2013_compat, 'MSVC2013Compat')

				def unit_test(env, test_name, program_target, args=None):

				    env.InstallProgram(program_target)

				    cmd = [program_target[0].abspath]

				    if args is not None:

				        cmd += args

				    cmd = ' '.join(cmd)

				    # http://www.scons.org/wiki/UnitTests

				    action = SCons.Action.Action(cmd, "  Running $SOURCE ...")

				    alias = env.Alias(test_name, program_target, action)

				    env.AlwaysBuild(alias)

				    env.Depends('check', alias)

				def num_jobs():

				@@ -164,16 +171,6 @@ def generate(env):

				    # Allow override compiler and specify additional flags from environment

				    if os.environ.has_key('CC'):

				        env['CC'] = os.environ['CC']

				        # Update CCVERSION to match

				        pipe = SCons.Action._subproc(env, [env['CC'], '--version'],

				                                     stdin = 'devnull',

				                                     stderr = 'devnull',

				                                     stdout = subprocess.PIPE)

				        if pipe.wait() == 0:

				            line = pipe.stdout.readline()

				            match = re.search(r'[0-9]+(\.[0-9]+)+', line)

				            if match:

				                env['CCVERSION'] = match.group(0)

				    if os.environ.has_key('CFLAGS'):

				        env['CCFLAGS'] += SCons.Util.CLVar(os.environ['CFLAGS'])

				    if os.environ.has_key('CXX'):

				@@ -186,14 +183,15 @@ def generate(env):

				    # Detect gcc/clang not by executable name, but through pre-defined macros

				    # as autoconf does, to avoid drawing wrong conclusions when using tools

				    # that overrice CC/CXX like scan-build.

				    env['gcc'] = 0

				    env['gcc_compat'] = 0

				    env['clang'] = 0

				    env['msvc'] = 0

				    if host_platform.system() == 'Windows':

				        env['msvc'] = check_cc(env, 'MSVC', 'defined(_MSC_VER)', '/E')

				    if not env['msvc']:

				        env['gcc'] = check_cc(env, 'GCC', 'defined(__GNUC__) && !defined(__clang__)')

				        env['clang'] = check_cc(env, 'Clang', '__clang__')

				        env['gcc_compat'] = check_cc(env, 'GCC', 'defined(__GNUC__)')

				    env['clang'] = check_cc(env, 'Clang', '__clang__')

				    env['gcc'] = env['gcc_compat'] and not env['clang']

				    env['suncc'] = env['platform'] == 'sunos' and os.path.basename(env['CC']) == 'cc'

				    env['icc'] = 'icc' == os.path.basename(env['CC'])

				@@ -206,7 +204,7 @@ def generate(env):

				    platform = env['platform']

				    x86 = env['machine'] == 'x86'

				    ppc = env['machine'] == 'ppc'

				    gcc_compat = env['gcc'] or env['clang']

				    gcc_compat = env['gcc_compat']

				    msvc = env['msvc']

				    suncc = env['suncc']

				    icc = env['icc']

				@@ -292,7 +290,11 @@ def generate(env):

				    # C preprocessor options

				    cppdefines = []

				    cppdefines += ['__STDC_LIMIT_MACROS', '__STDC_CONSTANT_MACROS']

				    cppdefines += [

				        '__STDC_LIMIT_MACROS',

				        '__STDC_CONSTANT_MACROS',

				        'HAVE_NO_AUTOCONF',

				    ]

				    if env['build'] in ('debug', 'checked'):

				        cppdefines += ['DEBUG']

				    else:

				@@ -307,8 +309,6 @@ def generate(env):

				            '_BSD_SOURCE',

				            '_GNU_SOURCE',

				            '_DEFAULT_SOURCE',

				            'HAVE_PTHREAD',

				            'HAVE_POSIX_MEMALIGN',

				        ]

				        if env['platform'] == 'darwin':

				            cppdefines += [

				@@ -329,11 +329,6 @@ def generate(env):

				        if env['platform'] in ('linux', 'darwin'):

				            cppdefines += ['HAVE_XLOCALE_H']

				    if env['platform'] == 'haiku':

				        cppdefines += [

				            'HAVE_PTHREAD',

				            'HAVE_POSIX_MEMALIGN'

				        ]

				    if platform == 'windows':

				        cppdefines += [

				            'WIN32',

				@@ -367,26 +362,6 @@ def generate(env):

				        print 'warning: Floating-point textures enabled.'

				        print 'warning: Please consult docs/patents.txt with your lawyer before building Mesa.'

				        cppdefines += ['TEXTURE_FLOAT_ENABLED']

				    if gcc_compat:

				        ccversion = env['CCVERSION']

				        cppdefines += [

				            'HAVE___BUILTIN_EXPECT',

				            'HAVE___BUILTIN_FFS',

				            'HAVE___BUILTIN_FFSLL',

				            'HAVE_FUNC_ATTRIBUTE_FLATTEN',

				            'HAVE_FUNC_ATTRIBUTE_UNUSED',

				            # GCC 3.0

				            'HAVE_FUNC_ATTRIBUTE_FORMAT',

				            'HAVE_FUNC_ATTRIBUTE_PACKED',

				            # GCC 3.4

				            'HAVE___BUILTIN_CTZ',

				            'HAVE___BUILTIN_POPCOUNT',

				            'HAVE___BUILTIN_POPCOUNTLL',

				            'HAVE___BUILTIN_CLZ',

				            'HAVE___BUILTIN_CLZLL',

				        ]

				        if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):

				            cppdefines += ['HAVE___BUILTIN_UNREACHABLE']

				    env.Append(CPPDEFINES = cppdefines)

				    # C compiler options

				@@ -394,13 +369,8 @@ def generate(env):

				    cxxflags = [] # C++

				    ccflags = [] # C & C++

				    if gcc_compat:

				        ccversion = env['CCVERSION']

				        if env['build'] == 'debug':

				            ccflags += ['-O0']

				        elif env['gcc'] and ccversion.startswith('4.2.'):

				            # gcc 4.2.x optimizer is broken

				            print "warning: gcc 4.2.x optimizer is broken -- disabling optimizations"

				            ccflags += ['-O0']

				        else:

				            ccflags += ['-O3']

				        if env['gcc']:

				@@ -410,7 +380,7 @@ def generate(env):

				        # Work around aliasing bugs - developers should comment this out

				        ccflags += ['-fno-strict-aliasing']

				        ccflags += ['-g']

				        if env['build'] in ('checked', 'profile'):

				        if env['build'] in ('checked', 'profile') or env['asan']:

				            # See http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#Which_options_should_I_pass_to_gcc_when_compiling_for_profiling?

				            ccflags += [

				                '-fno-omit-frame-pointer',

				@@ -481,13 +451,13 @@ def generate(env):

				                '/O2', # optimize for speed

				            ]

				        if env['build'] == 'release':

				            ccflags += [

				                '/GL', # enable whole program optimization

				            ]

				            if not env['clang']:

				                ccflags += [

				                    '/GL', # enable whole program optimization

				                ]

				        else:

				            ccflags += [

				                '/Oy-', # disable frame pointer omission

				                '/GL-', # disable whole program optimization

				            ]

				        ccflags += [

				            '/W3', # warning level

				@@ -501,6 +471,10 @@ def generate(env):

				            '/wd4800', # forcing value to bool 'true' or 'false' (performance warning)

				            '/wd4996', # disable deprecated POSIX name warnings

				        ]

				        if env['clang']:

				            ccflags += [

				                '-Wno-microsoft-enum-value', # enumerator value is not representable in underlying type 'int'

				            ]

				        if env['machine'] == 'x86':

				            ccflags += [

				                '/arch:SSE2', # use the SSE2 instructions (default since MSVC 2012)

				@@ -540,6 +514,16 @@ def generate(env):

				            # scan-build will produce more comprehensive output

				            env.Append(CCFLAGS = ['--analyze'])

				    # https://github.com/google/sanitizers/wiki/AddressSanitizer

				    if env['asan']:

				        if gcc_compat:

				            env.Append(CCFLAGS = [

				                '-fsanitize=address',

				            ])

				            env.Append(LINKFLAGS = [

				                '-fsanitize=address',

				            ])

				    # Assembler options

				    if gcc_compat:

				        if env['machine'] == 'x86':

				@@ -577,7 +561,7 @@ def generate(env):

				            shlinkflags += ['-Wl,--enable-stdcall-fixup']

				            #shlinkflags += ['-Wl,--kill-at']

				    if msvc:

				        if env['build'] == 'release':

				        if env['build'] == 'release' and not env['clang']:

				            # enable Link-time Code Generation

				            linkflags += ['/LTCG']

				            env.Append(ARFLAGS = ['/LTCG'])

				@@ -657,8 +641,10 @@ def generate(env):

				    # Custom builders and methods

				    env.Tool('custom')

				    createInstallMethods(env)

				    createMSVCCompatMethods(env)

				    env.AddMethod(install_program, 'InstallProgram')

				    env.AddMethod(install_shared_library, 'InstallSharedLibrary')

				    env.AddMethod(msvc2013_compat, 'MSVC2013Compat')

				    env.AddMethod(unit_test, 'UnitTest')

				    env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])

				    env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])

2301

scripts/get_reviewer.pl Executable file


File diff suppressed because it is too large

									
										73

src/Makefile.am
									
												
				@@ -19,11 +19,65 @@

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				.PHONY: git_sha1.h.tmp

				git_sha1.h.tmp:

					@# Don't assume that $(top_srcdir)/.git is a directory. It may be

					@# a gitlink file if $(top_srcdir) is a submodule checkout or a linked

					@# worktree.

					@# If we are building from a release tarball copy the bundled header.

					@touch git_sha1.h.tmp

					@if test -e $(top_srcdir)/.git; then \

						if which git > /dev/null; then \

						    git --git-dir=$(top_srcdir)/.git log -n 1 --oneline | \

							sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \

							> git_sha1.h.tmp ; \

						fi \

					fi

				git_sha1.h: git_sha1.h.tmp

					@echo "updating git_sha1.h"

					@if ! cmp -s git_sha1.h.tmp git_sha1.h; then \

						mv git_sha1.h.tmp git_sha1.h ;\

					else \

						rm git_sha1.h.tmp ;\

					fi

				BUILT_SOURCES = git_sha1.h

				CLEANFILES = $(BUILT_SOURCES)

				SUBDIRS = . gtest util mapi/glapi/gen mapi

				if HAVE_OPENGL

				gldir = $(includedir)/GL

				gl_HEADERS = \

				  $(top_srcdir)/include/GL/gl.h \

				  $(top_srcdir)/include/GL/glext.h \

				  $(top_srcdir)/include/GL/glcorearb.h \

				  $(top_srcdir)/include/GL/gl_mangle.h

				endif

				if HAVE_GLX

				glxdir = $(includedir)/GL

				glx_HEADERS = \

				  $(top_srcdir)/include/GL/glx.h \

				  $(top_srcdir)/include/GL/glxext.h \

				  $(top_srcdir)/include/GL/glx_mangle.h

				pkgconfigdir = $(libdir)/pkgconfig

				pkgconfig_DATA = mesa/gl.pc

				endif

				if HAVE_COMMON_OSMESA

				osmesadir = $(includedir)/GL

				osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h

				endif

				# include only conditionally ?

				SUBDIRS += compiler

				if HAVE_INTEL_DRIVERS

				SUBDIRS += intel

				endif

				if NEED_OPENGL_COMMON

				SUBDIRS += mesa

				endif

				@@ -34,24 +88,37 @@ if HAVE_DRI_GLX

				SUBDIRS += glx

				endif

				if HAVE_EGL_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-egl egl/wayland/wayland-drm

				## Optionally required by GBM and EGL

				if HAVE_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-drm

				endif

				## Optionally required by EGL (aka PLATFORM_GBM)

				if HAVE_GBM

				SUBDIRS += gbm

				endif

				## Optionally required by EGL

				if HAVE_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-egl

				endif

				if HAVE_EGL

				SUBDIRS += egl

				endif

				## Requires the i965 compiler (part of mesa) and wayland-drm

				if HAVE_INTEL_VULKAN

				SUBDIRS += intel/vulkan

				endif

				if HAVE_GALLIUM

				SUBDIRS += gallium

				endif

				EXTRA_DIST = \

					getopt hgl SConscript

					getopt hgl SConscript \

					$(top_srcdir)/include/GL/mesa_glinterop.h

				AM_CFLAGS = $(VISIBILITY_CFLAGS)

				AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
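
The `git_sha1.h` rule above writes to `git_sha1.h.tmp` first and only moves it over `git_sha1.h` when `cmp -s` reports a difference, so an unchanged header keeps its timestamp and does not trigger rebuilds. A minimal Python sketch of that same compare-then-replace idea follows; the function name `update_if_changed` is hypothetical and is not part of the Mesa build.

```python
import filecmp
import os


def update_if_changed(tmp_path, final_path):
    """Promote tmp_path to final_path only when the contents differ.

    Returns True if final_path was created or replaced, False if it was
    left untouched. Mirrors: if ! cmp -s tmp final; then mv; else rm; fi
    """
    if os.path.exists(final_path) and filecmp.cmp(tmp_path, final_path, shallow=False):
        # Same bytes: drop the temporary file, keep the old file and its mtime.
        os.remove(tmp_path)
        return False
    # Missing or different target: the temporary file becomes the real one.
    os.replace(tmp_path, final_path)
    return True
```

As in the make rule, a missing target counts as "different", so the first build always installs the header.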

4

src/compiler/.gitignore vendored


@@ -1 +1,5 @@
 glsl_compiler
 subtest-cr
 subtest-cr-lf
 subtest-lf
 subtest-lf-cr

									
										16

src/compiler/glsl/Android.gen.mk → src/compiler/Android.glsl.gen.mk
									
				@@ -32,8 +32,10 @@ intermediates := $(call local-generated-sources-dir)

				LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)

				LOCAL_C_INCLUDES += \

					$(intermediates)/glcpp \

					$(MESA_TOP)/src/glsl/glcpp \

					$(intermediates)/glsl \

					$(intermediates)/glsl/glcpp \

					$(LOCAL_PATH)/glsl \

					$(LOCAL_PATH)/glsl/glcpp \

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(LIBGLCPP_GENERATED_FILES) \

				@@ -51,6 +53,8 @@ define glsl_local-y-to-c-and-h

					$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<

				endef

				YACC_HEADER_SUFFIX := .hpp

				define local-yy-to-cpp-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

				@@ -63,14 +67,14 @@ define local-yy-to-cpp-and-h

					rm -f $(@:$1=$(YACC_HEADER_SUFFIX))

				endef

				$(intermediates)/glsl_lexer.cpp: $(LOCAL_PATH)/glsl_lexer.ll

				$(intermediates)/glsl/glsl_lexer.cpp: $(LOCAL_PATH)/glsl/glsl_lexer.ll

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl_parser.cpp: $(LOCAL_PATH)/glsl_parser.yy

				$(intermediates)/glsl/glsl_parser.cpp: $(LOCAL_PATH)/glsl/glsl_parser.yy

					$(call local-yy-to-cpp-and-h,.cpp)

				$(intermediates)/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glcpp/glcpp-lex.l

				$(intermediates)/glsl/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-lex.l

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glcpp/glcpp-parse.y

				$(intermediates)/glsl/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-parse.y

					$(call glsl_local-y-to-c-and-h)

									
										29

src/compiler/glsl/Android.mk → src/compiler/Android.glsl.mk
									
				@@ -36,7 +36,6 @@ include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(LIBGLCPP_FILES) \

					$(LIBGLSL_FILES) \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

				@@ -44,33 +43,13 @@ LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_STATIC_LIBRARIES := \

					libmesa_compiler \

					libmesa_nir

				LOCAL_MODULE := libmesa_glsl

				include $(LOCAL_PATH)/Android.gen.mk

				include $(LOCAL_PATH)/Android.glsl.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				# ---------------------------------------

				# Build glsl_compiler

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(GLSL_COMPILER_CXX_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util

				LOCAL_MODULE_TAGS := eng

				LOCAL_MODULE := glsl_compiler

				include $(MESA_COMMON_MK)

				include $(BUILD_EXECUTABLE)

									
										23

src/compiler/Android.mk
									
				@@ -43,25 +43,6 @@ LOCAL_MODULE := libmesa_compiler

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				# ---------------------------------------

				# Build libmesa_nir

				# ---------------------------------------

				include $(LOCAL_PATH)/Android.glsl.mk

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_nir

				include $(LOCAL_PATH)/Android.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				include $(LOCAL_PATH)/Android.nir.mk

									
										4

src/compiler/Android.gen.mk → src/compiler/Android.nir.gen.mk
									
				@@ -42,6 +42,10 @@ LOCAL_EXPORT_C_INCLUDE_DIRS += \

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(NIR_GENERATED_FILES))

				# Modules using libmesa_nir must set LOCAL_GENERATED_SOURCES to this

				MESA_GEN_NIR_H := $(addprefix $(call local-generated-sources-dir)/, \

					nir/nir_opcodes.h \

					nir/nir_builder_opcodes.h)

				nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py

				nir_builder_opcodes_deps := \

									
										49

src/compiler/Android.nir.mk
									
										Normal file
									
				@@ -0,0 +1,49 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				LOCAL_PATH := $(call my-dir)

				include $(LOCAL_PATH)/Makefile.sources

				# ---------------------------------------

				# Build libmesa_nir

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_nir

				include $(LOCAL_PATH)/Android.nir.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

									
										270

src/compiler/Makefile.am
									
				@@ -31,6 +31,8 @@ AM_CPPFLAGS = \

					-I$(top_builddir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl/glcpp\

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir \

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/gtest/include \

				@@ -54,272 +56,8 @@ BUILT_SOURCES =

				CLEANFILES =

				EXTRA_DIST = SConscript

				EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README	\

					glsl/TODO glsl/glcpp/README			\

					glsl/glsl_lexer.ll				\

					glsl/glsl_parser.yy				\

					glsl/glcpp/glcpp-lex.l				\

					glsl/glcpp/glcpp-parse.y			\

					glsl/Makefile.sources				\

					glsl/SConscript

				TESTS += glsl/glcpp/tests/glcpp-test			\

					glsl/glcpp/tests/glcpp-test-cr-lf		\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/optimization-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				TESTS_ENVIRONMENT= \

					export PYTHON2=$(PYTHON2); \

					export PYTHON_FLAGS=$(PYTHON_FLAGS);

				check_PROGRAMS +=					\

					glsl/glcpp/glcpp				\

					glsl/glsl_test					\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				noinst_PROGRAMS = glsl_compiler

				glsl_tests_blob_test_SOURCES =				\

					glsl/tests/blob_test.c

				glsl_tests_blob_test_LDADD =				\

					glsl/libglsl.la

				glsl_tests_general_ir_test_SOURCES =			\

					glsl/standalone_scaffolding.cpp			\

					glsl/tests/builtin_variable_test.cpp		\

					glsl/tests/invalidate_locations_test.cpp	\

					glsl/tests/general_ir_test.cpp			\

					glsl/tests/varyings_test.cpp

				glsl_tests_general_ir_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_general_ir_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_uniform_initializer_test_SOURCES =		\

					glsl/tests/copy_constant_to_storage_tests.cpp	\

					glsl/tests/set_uniform_initializer_tests.cpp	\

					glsl/tests/uniform_initializer_utils.cpp	\

					glsl/tests/uniform_initializer_utils.h

				glsl_tests_uniform_initializer_test_CFLAGS =		\

					$(PTHREAD_CFLAGS)

				glsl_tests_uniform_initializer_test_LDADD =		\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_sampler_types_test_SOURCES =			\

					glsl/tests/sampler_types_test.cpp

				glsl_tests_sampler_types_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_sampler_types_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la

				glsl_libglcpp_la_LIBADD =				\

					$(top_builddir)/src/util/libmesautil.la

				glsl_libglcpp_la_SOURCES =				\

					glsl/glcpp/glcpp-lex.c				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-parse.h			\

					$(LIBGLCPP_FILES)

				glsl_glcpp_glcpp_SOURCES =				\

					glsl/glcpp/glcpp.c

				glsl_glcpp_glcpp_LDADD =				\

					glsl/libglcpp.la	\

					$(top_builddir)/src/libglsl_util.la		\

					-lm

				glsl_libglsl_la_LIBADD = \

					nir/libnir.la \

					glsl/libglcpp.la

				glsl_libglsl_la_SOURCES =				\

					glsl/glsl_lexer.cpp				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_parser.h				\

					$(LIBGLSL_FILES)

				glsl_compiler_SOURCES = \

					$(GLSL_COMPILER_CXX_FILES)

				glsl_compiler_LDADD =					\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				glsl_glsl_test_SOURCES = \

					glsl/standalone_scaffolding.cpp \

					glsl/test.cpp \

					glsl/test_optpass.cpp \

					glsl/test_optpass.h

				glsl_glsl_test_LDADD =					\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				# We write our own rules for yacc and lex below. We'd rather use automake,

				# but automake makes it especially difficult for a number of reasons:

				#

				#  * < automake-1.12 generates .h files from .yy and .ypp files, but

				#    >=automake-1.12 generates .hh and .hpp files respectively. There's no

				#    good way of making a project that uses C++ yacc files compatible with

				#    both versions of automake. Strong work automake developers.

				#

				#  * Since we're generating code from .l/.y files in a subdirectory (glcpp/)

				#    we'd like the resulting generated code to also go in glcpp/ for purposes

				#    of distribution. Automake gives no way to do this.

				#

				#  * Since we're building multiple yacc parsers into one library (and via one

				#    Makefile) we have to use per-target YFLAGS. Using per-target YFLAGS causes

				#    automake to name the resulting generated code as <library-name>_filename.c.

				#    Frankly, that's ugly and we don't want a libglcpp_glcpp_parser.h file.

				# In order to make build output print "LEX" and "YACC", we reproduce the

				# automake variables below.

				AM_V_LEX = $(am__v_LEX_$(V))

				am__v_LEX_ = $(am__v_LEX_$(AM_DEFAULT_VERBOSITY))

				am__v_LEX_0 = @echo "  LEX     " $@;

				am__v_LEX_1 =

				AM_V_YACC = $(am__v_YACC_$(V))

				am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))

				am__v_YACC_0 = @echo "  YACC    " $@;

				am__v_YACC_1 =

				MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)

				YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)

				LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)

				glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy

				include Makefile.glsl.am

				glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll

				glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y

				glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l

				# Only the parsers (specifically the header files generated at the same time)

				# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is

				# called for the .c/.cpp file and the .h files. By listing the .c/.cpp files

				# YACC is only executed once for each parser. The rest of the generated code

				# will be created at the appropriate times according to standard automake

				# dependency rules.

				BUILT_SOURCES +=					\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				CLEANFILES +=						\

					glsl/glcpp/glcpp-parse.h			\

					glsl/glsl_parser.h				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				clean-local:

					$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr

				dist-hook:

					$(RM) glsl/glcpp/tests/*.out

					$(RM) glsl/glcpp/tests/subtest*/*.out

				noinst_LTLIBRARIES += nir/libnir.la

				nir_libnir_la_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_libnir_la_LIBADD = \

					libcompiler.la

				nir_libnir_la_SOURCES =					\

					$(NIR_FILES)					\

					$(NIR_GENERATED_FILES)

				PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

				nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)

				nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)

				check_PROGRAMS += nir/tests/control_flow_tests

				nir_tests_control_flow_tests_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_tests_control_flow_tests_SOURCES =			\

					nir/tests/control_flow_tests.cpp

				nir_tests_control_flow_tests_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				nir_tests_control_flow_tests_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					nir/libnir.la	\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				TESTS += nir/tests/control_flow_tests

				BUILT_SOURCES += $(NIR_GENERATED_FILES)

				CLEANFILES += $(NIR_GENERATED_FILES)

				EXTRA_DIST += \

					nir/nir_algebraic.py				\

					nir/nir_builder_opcodes_h.py			\

					nir/nir_constant_expressions.py			\

					nir/nir_opcodes.py				\

					nir/nir_opcodes_c.py				\

					nir/nir_opcodes_h.py				\

					nir/nir_opt_algebraic.py			\

					nir/tests					\

					nir/Makefile.sources

				include Makefile.nir.am
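
The NIR generator rules in this file share one shape: create the output directory, redirect the script's stdout into the target, and delete the target if the script fails (`> $@ || ($(RM) $@; false)`), so a truncated file is never mistaken for an up-to-date one. The sketch below shows that pattern in Python under stated assumptions; `generate` is a hypothetical helper for illustration, not something the Mesa tree provides.

```python
import os
import subprocess
import sys


def generate(cmd, out_path):
    """Run cmd with stdout redirected to out_path; remove out_path on failure.

    Returns True on success, False if the command exited non-zero.
    """
    # Equivalent of $(MKDIR_GEN): make sure the target directory exists.
    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
    with open(out_path, "w") as out:
        result = subprocess.run(cmd, stdout=out)
    if result.returncode != 0:
        # Equivalent of ($(RM) $@; false): discard partial output.
        os.remove(out_path)
        return False
    return True


if __name__ == "__main__":
    # Illustrative invocation only; the inline script stands in for a generator.
    ok = generate([sys.executable, "-c", "print('/* generated */')"], "out/example.h")
    print("generated" if ok else "failed")
```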

									
										224

src/compiler/glsl/Makefile.am → src/compiler/Makefile.glsl.am
									
				@@ -1,4 +1,6 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				@@ -19,140 +21,130 @@

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				AM_CPPFLAGS = \

					-I$(top_srcdir)/include \

					-I$(top_srcdir)/src \

					-I$(top_srcdir)/src/mapi \

					-I$(top_srcdir)/src/mesa/ \

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/glsl/glcpp \

					-I$(top_srcdir)/src/gtest/include \

					$(DEFINES)

				AM_CFLAGS = \

					$(VISIBILITY_CFLAGS) \

					$(MSVC2013_COMPAT_CFLAGS)

				AM_CXXFLAGS = \

					$(VISIBILITY_CXXFLAGS) \

					$(MSVC2013_COMPAT_CXXFLAGS)

				EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README	\

					glsl/TODO glsl/glcpp/README			\

					glsl/glsl_lexer.ll				\

					glsl/glsl_parser.yy				\

					glsl/glcpp/glcpp-lex.l				\

					glsl/glcpp/glcpp-parse.y			\

					SConscript.glsl

				EXTRA_DIST = tests glcpp/tests README TODO glcpp/README	\

					glsl_lexer.ll					\

					glsl_parser.yy					\

					glcpp/glcpp-lex.l				\

					glcpp/glcpp-parse.y				\

					SConscript

				include Makefile.sources

				TESTS = glcpp/tests/glcpp-test				\

					glcpp/tests/glcpp-test-cr-lf			\

					tests/blob-test					\

					tests/general-ir-test				\

					tests/optimization-test				\

					tests/sampler-types-test                        \

					tests/uniform-initializer-test

				TESTS += glsl/glcpp/tests/glcpp-test			\

					glsl/glcpp/tests/glcpp-test-cr-lf		\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/optimization-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test             \

					glsl/tests/warnings-test

				TESTS_ENVIRONMENT= \

					export PYTHON2=$(PYTHON2); \

					export PYTHON_FLAGS=$(PYTHON_FLAGS);

				noinst_LTLIBRARIES = libglsl.la libglcpp.la

				check_PROGRAMS =					\

					glcpp/glcpp					\

					glsl_test					\

					tests/blob-test					\

					tests/general-ir-test				\

					tests/sampler-types-test			\

					tests/uniform-initializer-test

				check_PROGRAMS +=					\

					glsl/glcpp/glcpp				\

					glsl/glsl_test					\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				noinst_PROGRAMS = glsl_compiler

				tests_blob_test_SOURCES =				\

					tests/blob_test.c

				tests_blob_test_LDADD =					\

					$(top_builddir)/src/glsl/libglsl.la

				glsl_tests_blob_test_SOURCES =				\

					glsl/tests/blob_test.c

				glsl_tests_blob_test_LDADD =				\

					glsl/libglsl.la

				tests_general_ir_test_SOURCES =		\

					standalone_scaffolding.cpp			\

					tests/builtin_variable_test.cpp			\

					tests/invalidate_locations_test.cpp		\

					tests/general_ir_test.cpp			\

					tests/varyings_test.cpp

				tests_general_ir_test_CFLAGS =				\

				glsl_tests_general_ir_test_SOURCES =			\

					glsl/tests/builtin_variable_test.cpp		\

					glsl/tests/invalidate_locations_test.cpp	\

					glsl/tests/general_ir_test.cpp			\

					glsl/tests/varyings_test.cpp

				glsl_tests_general_ir_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				tests_general_ir_test_LDADD =				\

				glsl_tests_general_ir_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					$(top_builddir)/src/glsl/libglsl.la		\

					glsl/libglsl.la		\

					glsl/libstandalone.la				\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				tests_uniform_initializer_test_SOURCES =		\

					tests/copy_constant_to_storage_tests.cpp	\

					tests/set_uniform_initializer_tests.cpp		\

					tests/uniform_initializer_utils.cpp		\

					tests/uniform_initializer_utils.h

				tests_uniform_initializer_test_CFLAGS =			\

				glsl_tests_uniform_initializer_test_SOURCES =		\

					glsl/tests/copy_constant_to_storage_tests.cpp	\

					glsl/tests/set_uniform_initializer_tests.cpp	\

					glsl/tests/uniform_initializer_utils.cpp	\

					glsl/tests/uniform_initializer_utils.h

				glsl_tests_uniform_initializer_test_CFLAGS =		\

					$(PTHREAD_CFLAGS)

				tests_uniform_initializer_test_LDADD =			\

				glsl_tests_uniform_initializer_test_LDADD =		\

					$(top_builddir)/src/gtest/libgtest.la		\

					$(top_builddir)/src/glsl/libglsl.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				tests_sampler_types_test_SOURCES =			\

					tests/sampler_types_test.cpp

				tests_sampler_types_test_CFLAGS =			\

				glsl_tests_sampler_types_test_SOURCES =			\

					glsl/tests/sampler_types_test.cpp

				glsl_tests_sampler_types_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				tests_sampler_types_test_LDADD =			\

				glsl_tests_sampler_types_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					$(top_builddir)/src/glsl/libglsl.la		\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				libglcpp_la_LIBADD =					\

				noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la glsl/libstandalone.la

				glsl_libglcpp_la_LIBADD =				\

					$(top_builddir)/src/util/libmesautil.la

				libglcpp_la_SOURCES =					\

					glcpp/glcpp-lex.c				\

					glcpp/glcpp-parse.c				\

					glcpp/glcpp-parse.h				\

				glsl_libglcpp_la_SOURCES =				\

					glsl/glcpp/glcpp-lex.c				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-parse.h			\

					$(LIBGLCPP_FILES)

				glcpp_glcpp_SOURCES =					\

					glcpp/glcpp.c

				glcpp_glcpp_LDADD =					\

					libglcpp.la					\

				glsl_glcpp_glcpp_SOURCES =				\

					glsl/glcpp/glcpp.c

				glsl_glcpp_glcpp_LDADD =				\

					glsl/libglcpp.la	\

					$(top_builddir)/src/libglsl_util.la		\

					-lm

				libglsl_la_LIBADD = \

					$(top_builddir)/src/compiler/nir/libnir.la \

					libglcpp.la

				glsl_libglsl_la_LIBADD = \

					nir/libnir.la \

					glsl/libglcpp.la

				libglsl_la_SOURCES =					\

					glsl_lexer.cpp					\

					glsl_parser.cpp					\

					glsl_parser.h					\

				glsl_libglsl_la_SOURCES =				\

					glsl/glsl_lexer.cpp				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_parser.h				\

					$(LIBGLSL_FILES)

				glsl_compiler_SOURCES = \

				glsl_libstandalone_la_SOURCES = \

					$(GLSL_COMPILER_CXX_FILES)

				glsl_compiler_LDADD =					\

					libglsl.la					\

				glsl_libstandalone_la_LIBADD =				\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				glsl_test_SOURCES = \

					standalone_scaffolding.cpp \

					test.cpp \

					test_optpass.cpp \

					test_optpass.h

				glsl_compiler_SOURCES = \

					glsl/main.cpp

				glsl_test_LDADD =					\

					libglsl.la					\

				glsl_compiler_LDADD = \

					glsl/libstandalone.la

				glsl_glsl_test_SOURCES = \

					glsl/test.cpp \

					glsl/test_optpass.cpp \

					glsl/test_optpass.h

				glsl_glsl_test_LDADD =					\

					glsl/libglsl.la					\

					glsl/libstandalone.la				\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				@@ -186,23 +178,24 @@ am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))

				am__v_YACC_0 = @echo "  YACC    " $@;

				am__v_YACC_1 =

				MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)

				YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)

				LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)

				glsl_parser.cpp glsl_parser.h: glsl_parser.yy

					$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl_parser.h $(srcdir)/glsl_parser.yy

				glsl_lexer.cpp: glsl_lexer.ll

					$(LEX_GEN) -o $@ $(srcdir)/glsl_lexer.ll

				glcpp/glcpp-parse.c glcpp/glcpp-parse.h: glcpp/glcpp-parse.y

				glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glcpp/glcpp-parse.h $(srcdir)/glcpp/glcpp-parse.y

					$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy

				glcpp/glcpp-lex.c: glcpp/glcpp-lex.l

				glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glcpp/glcpp-lex.l

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll

				glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y

				glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l

				# Only the parsers (specifically the header files generated at the same time)

				# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is

				@@ -210,19 +203,22 @@ glcpp/glcpp-lex.c: glcpp/glcpp-lex.l

				# YACC is only executed once for each parser. The rest of the generated code

				# will be created at the appropriate times according to standard automake

				# dependency rules.

				BUILT_SOURCES =						\

					glsl_parser.cpp					\

					glsl_lexer.cpp					\

					glcpp/glcpp-parse.c				\

					glcpp/glcpp-lex.c

				CLEANFILES =						\

					glcpp/glcpp-parse.h				\

					glsl_parser.h					\

					$(BUILT_SOURCES)

				BUILT_SOURCES +=					\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				CLEANFILES +=						\

					glsl/glcpp/glcpp-parse.h			\

					glsl/glsl_parser.h				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				clean-local:

					$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr

				dist-hook:

					$(RM) glcpp/tests/*.out

					$(RM) glcpp/tests/subtest*/*.out

					$(RM) glsl/glcpp/tests/*.out

					$(RM) glsl/glcpp/tests/subtest*/*.out

									
										90

src/compiler/Makefile.nir.am
									
										Normal file
									
				@@ -0,0 +1,90 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				noinst_LTLIBRARIES += nir/libnir.la

				nir_libnir_la_LIBADD = \

					libcompiler.la

				nir_libnir_la_SOURCES =					\

					$(NIR_FILES)					\

					$(SPIRV_FILES)					\

					$(NIR_GENERATED_FILES)

				PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

				nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)

				nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)

				check_PROGRAMS += nir/tests/control_flow_tests

				nir_tests_control_flow_tests_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_tests_control_flow_tests_SOURCES =			\

					nir/tests/control_flow_tests.cpp

				nir_tests_control_flow_tests_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				nir_tests_control_flow_tests_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					nir/libnir.la	\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				TESTS += nir/tests/control_flow_tests

				BUILT_SOURCES += $(NIR_GENERATED_FILES)

				CLEANFILES += $(NIR_GENERATED_FILES)

				EXTRA_DIST += \

					nir/nir_algebraic.py				\

					nir/nir_builder_opcodes_h.py			\

					nir/nir_constant_expressions.py			\

					nir/nir_opcodes.py				\

					nir/nir_opcodes_c.py				\

					nir/nir_opcodes_h.py				\

					nir/nir_opt_algebraic.py			\

					nir/tests \

					SConscript.nir
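The generator rules in this file all use the same shell idiom, `$(PYTHON_GEN) script.py > $@ || ($(RM) $@; false)`: the script's standard output becomes the target, and if the script fails the partially written target is deleted so a later `make` run does not mistake it for an up-to-date file. A minimal Python sketch of that behaviour follows. The helper name `generate` is hypothetical and not part of Mesa, and it runs the current interpreter via `sys.executable` rather than `$(PYTHON2)`.

```python
import os
import subprocess
import sys


def generate(script_args, target):
    """Run a Python generator and capture its stdout in `target`.

    Mirrors `cmd > $@ || ($(RM) $@; false)`: on failure the partially
    written target is removed and False is returned.
    """
    with open(target, "w") as out:
        result = subprocess.run([sys.executable, *script_args], stdout=out)
    if result.returncode != 0:
        # Do not leave a truncated file behind for the next build.
        os.remove(target)
        return False
    return True
```

A call such as `generate(["nir/nir_opcodes_h.py"], "nir/nir_opcodes.h")` would correspond to the `nir/nir_opcodes.h` rule, assuming the script path is valid relative to the working directory.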

									
										46

src/compiler/Makefile.sources
									
												
				@@ -25,6 +25,8 @@ LIBGLSL_FILES = \

					glsl/glsl_parser_extras.h \

					glsl/glsl_symbol_table.cpp \

					glsl/glsl_symbol_table.h \

					glsl/glsl_to_nir.cpp \

					glsl/glsl_to_nir.h \

					glsl/hir_field_selection.cpp \

					glsl/ir_basic_block.cpp \

					glsl/ir_basic_block.h \

				@@ -77,10 +79,10 @@ LIBGLSL_FILES = \

					glsl/loop_unroll.cpp \

					glsl/lower_buffer_access.cpp \

					glsl/lower_buffer_access.h \

					glsl/lower_clip_distance.cpp \

					glsl/lower_const_arrays_to_uniforms.cpp \

					glsl/lower_discard.cpp \

					glsl/lower_discard_flow.cpp \

					glsl/lower_distance.cpp \

					glsl/lower_if_to_cond_assign.cpp \

					glsl/lower_instructions.cpp \

					glsl/lower_jumps.cpp \

				@@ -129,6 +131,7 @@ LIBGLSL_FILES = \

					glsl/opt_tree_grafting.cpp \

					glsl/opt_vectorize.cpp \

					glsl/program.h \

					glsl/propagate_invariance.cpp \

					glsl/s_expression.cpp \

					glsl/s_expression.h

				@@ -137,7 +140,8 @@ LIBGLSL_FILES = \

				GLSL_COMPILER_CXX_FILES = \

					glsl/standalone_scaffolding.cpp \

					glsl/standalone_scaffolding.h \

					glsl/main.cpp

					glsl/standalone.cpp \

					glsl/standalone.h

				# libglsl generated sources

				LIBGLSL_GENERATED_CXX_FILES = \

				@@ -162,8 +166,6 @@ NIR_GENERATED_FILES = \

					nir/nir_opt_algebraic.c

				NIR_FILES = \

					nir/glsl_to_nir.cpp \

					nir/glsl_to_nir.h \

					nir/nir.c \

					nir/nir.h \

					nir/nir_array.h \

				@@ -175,23 +177,34 @@ NIR_FILES = \

					nir/nir_control_flow_private.h \

					nir/nir_dominance.c \

					nir/nir_from_ssa.c \

					nir/nir_gather_info.c \

					nir/nir_gs_count_vertices.c \

					nir/nir_intrinsics.c \

					nir/nir_intrinsics.h \

					nir/nir_inline_functions.c \

					nir/nir_instr_set.c \

					nir/nir_instr_set.h \

					nir/nir_intrinsics.c \

					nir/nir_intrinsics.h \

					nir/nir_liveness.c \

					nir/nir_lower_alu_to_scalar.c \

					nir/nir_lower_atomics.c \

					nir/nir_lower_bitmap.c \

					nir/nir_lower_clamp_color_outputs.c \

					nir/nir_lower_clip.c \

					nir/nir_lower_double_ops.c \

					nir/nir_lower_double_packing.c \

					nir/nir_lower_drawpixels.c \

					nir/nir_lower_global_vars_to_local.c \

					nir/nir_lower_gs_intrinsics.c \

					nir/nir_lower_load_const_to_scalar.c \

					nir/nir_lower_locals_to_regs.c \

					nir/nir_lower_idiv.c \

					nir/nir_lower_indirect_derefs.c \

					nir/nir_lower_io.c \

					nir/nir_lower_outputs_to_temporaries.c \

					nir/nir_lower_io_to_temporaries.c \

					nir/nir_lower_io_types.c \

					nir/nir_lower_passthrough_edgeflags.c \

					nir/nir_lower_phis_to_scalar.c \

					nir/nir_lower_returns.c \

					nir/nir_lower_samplers.c \

					nir/nir_lower_system_values.c \

					nir/nir_lower_tex.c \

				@@ -200,6 +213,8 @@ NIR_FILES = \

					nir/nir_lower_vars_to_ssa.c \

					nir/nir_lower_var_copies.c \

					nir/nir_lower_vec_to_movs.c \

					nir/nir_lower_wpos_center.c \

					nir/nir_lower_wpos_ytransform.c \

					nir/nir_metadata.c \

					nir/nir_move_vec_src_uses_to_dest.c \

					nir/nir_normalize_cubemap_coords.c \

				@@ -213,8 +228,12 @@ NIR_FILES = \

					nir/nir_opt_peephole_select.c \

					nir/nir_opt_remove_phis.c \

					nir/nir_opt_undef.c \

					nir/nir_phi_builder.c \

					nir/nir_phi_builder.h \

					nir/nir_print.c \

					nir/nir_propagate_invariant.c \

					nir/nir_remove_dead_variables.c \

					nir/nir_repair_ssa.c \

					nir/nir_search.c \

					nir/nir_search.h \

					nir/nir_split_var_copies.c \

				@@ -224,3 +243,16 @@ NIR_FILES = \

					nir/nir_vla.h \

					nir/nir_worklist.c \

					nir/nir_worklist.h

				SPIRV_FILES = \

					spirv/GLSL.std.450.h \

					spirv/nir_spirv.h \

					spirv/spirv.h \

					spirv/spirv_info.h \

					spirv/spirv_info.c \

					spirv/spirv_to_nir.c \

					spirv/vtn_alu.c \

					spirv/vtn_cfg.c \

					spirv/vtn_glsl450.c \

					spirv/vtn_private.h \

					spirv/vtn_variables.c

									
										3

src/compiler/SConscript
									
												
				@@ -21,4 +21,5 @@ compiler = env.ConvenienceLibrary(

				)

				Export('compiler')

				SConscript('glsl/SConscript')

				SConscript('SConscript.glsl')

				SConscript('SConscript.nir')

43

src/compiler/glsl/SConscript → src/compiler/SConscript.glsl


@@ -15,14 +15,17 @@ env.Prepend(CPPPATH = [
     '#src/mesa',
     '#src/gallium/include',
     '#src/gallium/auxiliary',
     '#src/glsl',
     '#src/glsl/glcpp',
     '#src/compiler/glsl',
     '#src/compiler/glsl/glcpp',
     '#src/compiler/nir',
 ])
 env.Prepend(LIBS = [mesautil])
 # Make glcpp-parse.h and glsl_parser.h reachable from the include path.
 env.Append(CPPPATH = [Dir('.').abspath, Dir('glcpp').abspath])
 env.Prepend(CPPPATH = [Dir('.').abspath, Dir('glsl').abspath])
 # Make NIR headers reachable from the include path.
 env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath])
 glcpp_env = env.Clone()
 glcpp_env.Append(YACCFLAGS = [
@@ -32,7 +35,7 @@ glcpp_env.Append(YACCFLAGS = [
 glsl_env = env.Clone()
 glsl_env.Append(YACCFLAGS = [
     '--defines=%s' % File('glsl_parser.h').abspath,
     '--defines=%s' % File('glsl/glsl_parser.h').abspath,
     '-p', '_mesa_glsl_',
 ])
@@ -40,10 +43,10 @@ glsl_env.Append(YACCFLAGS = [
 # "glsl_parser.h", causing glsl_parser.cpp to be regenerated every time
 glsl_env['YACCHXXFILESUFFIX'] = '.h'
 glcpp_lexer = glcpp_env.CFile('glcpp/glcpp-lex.c', 'glcpp/glcpp-lex.l')
 glcpp_parser = glcpp_env.CFile('glcpp/glcpp-parse.c', 'glcpp/glcpp-parse.y')
 glsl_lexer = glsl_env.CXXFile('glsl_lexer.cpp', 'glsl_lexer.ll')
 glsl_parser = glsl_env.CXXFile('glsl_parser.cpp', 'glsl_parser.yy')
 glcpp_lexer = glcpp_env.CFile('glsl/glcpp/glcpp-lex.c', 'glsl/glcpp/glcpp-lex.l')
 glcpp_parser = glcpp_env.CFile('glsl/glcpp/glcpp-parse.c', 'glsl/glcpp/glcpp-parse.y')
 glsl_lexer = glsl_env.CXXFile('glsl/glsl_lexer.cpp', 'glsl/glsl_lexer.ll')
 glsl_parser = glsl_env.CXXFile('glsl/glsl_parser.cpp', 'glsl/glsl_parser.yy')
 # common generated sources
 glsl_sources = [
@@ -51,7 +54,7 @@ glsl_sources = [
     glcpp_parser[0],
     glsl_lexer,
     glsl_parser[0],
 ]
 ]
 # parse Makefile.sources
 source_lists = env.ParseSourceList('Makefile.sources')
@@ -66,20 +69,20 @@ if env['msvc']:
 # Copy these files to avoid generation object files into src/mesa/program
 env.Prepend(CPPPATH = ['#src/mesa/main'])
 env.Command('imports.c', '#src/mesa/main/imports.c', Copy('$TARGET', '$SOURCE'))
 env.Command('glsl/imports.c', '#src/mesa/main/imports.c', Copy('$TARGET', '$SOURCE'))
 # Copy these files to avoid generation object files into src/mesa/program
 env.Prepend(CPPPATH = ['#src/mesa/program'])
 env.Command('prog_hash_table.c', '#src/mesa/program/prog_hash_table.c', Copy('$TARGET', '$SOURCE'))
 env.Command('symbol_table.c', '#src/mesa/program/symbol_table.c', Copy('$TARGET', '$SOURCE'))
 env.Command('dummy_errors.c', '#src/mesa/program/dummy_errors.c', Copy('$TARGET', '$SOURCE'))
 env.Command('glsl/prog_hash_table.c', '#src/mesa/program/prog_hash_table.c', Copy('$TARGET', '$SOURCE'))
 env.Command('glsl/symbol_table.c', '#src/mesa/program/symbol_table.c', Copy('$TARGET', '$SOURCE'))
 env.Command('glsl/dummy_errors.c', '#src/mesa/program/dummy_errors.c', Copy('$TARGET', '$SOURCE'))
 compiler_objs = env.StaticObject(source_lists['GLSL_COMPILER_CXX_FILES'])
 mesa_objs = env.StaticObject([
     'imports.c',
     'prog_hash_table.c',
     'symbol_table.c',
     'dummy_errors.c',
     'glsl/imports.c',
     'glsl/prog_hash_table.c',
     'glsl/symbol_table.c',
     'glsl/dummy_errors.c',
 ])
 compiler_objs += mesa_objs
@@ -109,6 +112,8 @@ if env['platform'] == 'windows':
 env.Prepend(LIBS = [compiler, glsl])
 compiler_objs += env.StaticObject("glsl/main.cpp")
 glsl_compiler = env.Program(
     target = 'glsl_compiler',
     source = compiler_objs,
@@ -116,7 +121,7 @@ glsl_compiler = env.Program(
 env.Alias('glsl_compiler', glsl_compiler)
 glcpp = env.Program(
     target = 'glcpp/glcpp',
     source = ['glcpp/glcpp.c'] + mesa_objs,
     target = 'glsl/glcpp/glcpp',
     source = ['glsl/glcpp/glcpp.c'] + mesa_objs,
 )
 env.Alias('glcpp', glcpp)

73

src/compiler/SConscript.nir Normal file


@@ -0,0 +1,73 @@
 import common
 Import('*')
 from sys import executable as python_cmd
 env = env.Clone()
 env.MSVC2013Compat()
 env.Prepend(CPPPATH = [
     '#include',
     '#src',
     '#src/mapi',
     '#src/mesa',
     '#src/gallium/include',
     '#src/gallium/auxiliary',
     '#src/compiler/nir',
 ])
 # Make generated headers reachable from the include path.
 env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath])
 # nir generated sources
 nir_builder_opcodes_h = env.CodeGenerate(
     target = 'nir/nir_builder_opcodes.h',
     script = 'nir/nir_builder_opcodes_h.py',
     source = [],
     command = python_cmd + ' $SCRIPT > $TARGET'
 )
 env.CodeGenerate(
     target = 'nir/nir_constant_expressions.c',
     script = 'nir/nir_constant_expressions.py',
     source = [],
     command = python_cmd + ' $SCRIPT > $TARGET'
 )
 env.CodeGenerate(
     target = 'nir/nir_opcodes.h',
     script = 'nir/nir_opcodes_h.py',
     source = [],
     command = python_cmd + ' $SCRIPT > $TARGET'
 )
 env.CodeGenerate(
     target = 'nir/nir_opcodes.c',
     script = 'nir/nir_opcodes_c.py',
     source = [],
     command = python_cmd + ' $SCRIPT > $TARGET'
 )
 env.CodeGenerate(
     target = 'nir/nir_opt_algebraic.c',
     script = 'nir/nir_opt_algebraic.py',
     source = [],
     command = python_cmd + ' $SCRIPT > $TARGET'
 )
 # parse Makefile.sources
 source_lists = env.ParseSourceList('Makefile.sources')
 nir_sources = source_lists['NIR_FILES']
 nir_sources += source_lists['NIR_GENERATED_FILES']
 nir = env.ConvenienceLibrary(
     target = 'nir',
     source = nir_sources,
 )
 env.Alias('nir', nir)
 Export('nir')
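`env.ParseSourceList('Makefile.sources')` in the script above yields a mapping from variable names such as `NIR_FILES` to lists of file paths. Its implementation is not shown in this diff, so the following is only a rough sketch of how such a parser could work. It assumes plain `NAME = \` assignments with backslash continuations, no `+=`, and no `$(VAR)` references. The function name `parse_source_lists` is hypothetical.

```python
def parse_source_lists(text):
    """Parse Makefile-style list assignments into {NAME: [items]}."""
    lists = {}
    logical = ""
    for raw in text.splitlines():
        # Drop comments and trailing whitespace.
        line = raw.split("#", 1)[0].rstrip()
        if line.endswith("\\"):
            # Continuation: keep accumulating the logical line.
            logical += line[:-1] + " "
            continue
        logical += line
        if "=" in logical:
            name, _, value = logical.partition("=")
            lists[name.strip()] = value.split()
        logical = ""
    return lists
```

With a mapping like this, lookups such as `source_lists['NIR_FILES']` and `source_lists['NIR_GENERATED_FILES']` return ordinary Python lists that can be concatenated, as the script does.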

Compare commits

6241 Commits mesa-11.2. ... 12.0

3 .gitignore vendored

460 .mailmap Normal file

28 .travis.yml

13 Android.common.mk

14 Android.mk

16 Makefile.am

106 REVIEWERS Normal file

19 SConstruct

2 VERSION

9 appveyor.yml

28 bin/.cherry-ignore Normal file

2 bin/bugzilla_mesa.sh

35 bin/get-extra-pick-list.sh Executable file

2 bin/get-pick-list.sh

39 bin/get-typod-pick-list.sh Executable file

1 common.py

414 configure.ac

490 docs/COPYING

378 docs/GL3.txt

4 docs/download.html

8 docs/egl.html

4 docs/envvars.html

27 docs/index.html

3 docs/install.html

14 docs/license.html

5 docs/relnotes.html

319 docs/relnotes/11.1.3.html Normal file

182 docs/relnotes/11.1.4.html Normal file

219 docs/relnotes/11.2.0.html

119 docs/relnotes/11.2.1.html Normal file

210 docs/relnotes/11.2.2.html Normal file

335 docs/relnotes/12.0.0.html Normal file

67 docs/relnotes/12.0.1.html Normal file

403 docs/relnotes/12.0.2.html Normal file

71 docs/relnotes/12.0.3.html Normal file

321 docs/relnotes/12.0.4.html Normal file

138 docs/relnotes/12.0.5.html Normal file

148 docs/relnotes/12.0.6.html Normal file

34 docs/repository.html

45 docs/shading.html

3 docs/systems.html

2 docs/utilities.html

8 doxygen/.gitignore vendored

7 doxygen/Makefile

51 doxygen/common.doxy

3 doxygen/core_subset.doxy

9 doxygen/doxy.bat

6 doxygen/gbm.doxy

8 doxygen/glapi.doxy

9 doxygen/glsl.doxy

2 doxygen/header.html

1 doxygen/header_subset.html

2 doxygen/i965.doxy

1 doxygen/main.doxy

2 doxygen/math.doxy

43 doxygen/shader.doxy → doxygen/nir.doxy

3 doxygen/radeon_subset.doxy

4 doxygen/swrast.doxy

2 doxygen/swrast_setup.doxy

9 doxygen/tnl.doxy

5 doxygen/tnl_dd.doxy

3 doxygen/vbo.doxy

10 include/D3D9/d3d9.h

20 include/D3D9/d3d9types.h

11 include/EGL/eglmesaext.h

108 include/GL/glcorearb.h

87 include/GL/glext.h

70 include/GL/internal/dri_interface.h

304 include/GL/mesa_glinterop.h Normal file

35 include/c11/threads_posix.h

48 include/c99_compat.h

23 include/c99_math.h

6 include/d3dadapter/drm.h

7 include/d3dadapter/present.h

6 include/pci_ids/i965_pci_ids.h

22 include/pci_ids/radeonsi_pci_ids.h

1 include/pci_ids/virtio_gpu_pci_ids.h

85 include/vulkan/vk_icd.h Normal file


127 include/vulkan/vk_platform.h Normal file